Subgroup Identification Analysis

Code
knitr::opts_chunk$set(echo = TRUE, message=FALSE, warning=FALSE)

options(warn = -1)

rm(list=ls())

library(tinytex)
library(ggplot2)
library(table1)

library(gt)

# test these packages
# library(tidyverse)
# library(plyr)
# library(dplyr)
# library(glmnet)

library(survival)
library(data.table)
library(randomForest)
library(grf)
library(policytree)
library(DiagrammeR)

library(grid)
library(forestploter)
library(randomizr)

codepath <- c("Rcode/forestsearch/R/")
source(paste0(codepath,"source_forestsearch_v0.R"))
source_fs_functions(file_loc=codepath)

source("Rcode/kmplotting/R/KMplotting_functions_v0.R")

# Save output flag
output <- TRUE
# Loading results
loadit <- FALSE

# Record time of all analyses
tALL.start<-proc.time()[3]
 
#library(doParallel)
#cat("Number of cores=",c(detectCores()),"\n")

# Note: leave workers unspecified
# on my linux machine workers(=100) needs to be set
plan(multisession)
#plan(multisession, workers=120)

nsims <- 1000
nboots <- 1000

Weibull AFT/Cox model with biomarker effects

Brief Review

For the Weibull distribution with shape parameter \(\nu\) and scale parameter \(\theta\) the density, cdf, survival, hazard and cumulative hazard functions are: \[\begin{eqnarray*} f(t) &=& \nu \theta^{-\nu}t^{\nu-1}\exp(-(t/\theta)^{\nu}), \cr F(t) &=& \int_{0}^{t}\nu \theta^{-\nu}s^{\nu-1}\exp(-(s/\theta)^{\nu})ds \cr &=& \int_{0}^{(t/\theta)^{\nu}}e^{-w}dw \cr & & (w=(s/\theta)^{\nu},dw=\nu s^{\nu-1}\theta^{-\nu}ds) \cr &=&1-\exp(-(t/\theta)^{\nu}), \cr S(t) &=& \exp(-(t/\theta)^{\nu}), \cr \lambda(t)&=&\nu \theta^{-\nu}t^{\nu-1}, \cr \Lambda(t) &=& -\log(S(t))=(t/\theta)^{\nu}. \end{eqnarray*}\] Note that here we define the density to correspond with R’s definition. For shape parameter \(\nu \in (0,1)\) the hazard is strictly decreasing in \(t \geq 0\), whereas for \(\nu >1\) the hazard is strictly increasing in \(t \geq 0\).

Note: \(\Lambda(T) \sim E(1)\)

The cumulative hazard function \(\Lambda(\cdot)\) evaluated at \(T\), \(\Lambda(T)\) as a random variable, has cdf \[\eqalign{ \Pr(\Lambda(T) \leq t) &=\Pr(-\log(1-F(T)) \leq t) =\Pr(1-F(T) \geq e^{-t}) \cr &=\Pr(T \leq F^{-1}(1-e^{-t})) =F(F^{-1}(1-e^{-t})) = 1-e^{-t}, \cr}\] which is the CDF of the exponential distribution, \(E(1)\) (say).

In the following we use

\[\begin{equation} \tag{1} \Lambda(T) \sim E(1) \end{equation}\] to represent the Weibull regression model as an AFT model which is also a Cox model.

Now, \(\Lambda(T)=(T/\theta)^{\nu}\) and write \(W=-\log(S(T))=\Lambda(T)=(T/\theta)^{\nu}\), where from (1), \(W \sim E(1)\). That is \(\log(W)=\nu(\log(T)-\log(\theta))\) can be expressed as

\[\begin{equation} \tag{2} \log(T)=\log(\theta)+ \tau \log(W) := \log(\theta) + \tau \epsilon, \end{equation}\] where \(\tau=1/\nu\) and it is easy to show that \(\epsilon=\log(W)\) has the ``extreme value’’ distribution with density \(f_{\epsilon}(x)=\exp(x-e^{x})\) for \(x \in {\cal R}\). Here the range of \(\log(T) \in {\cal R}\) is un-restricted. The survReg routine uses the parameterization \((2)\) and therefore estimates \(\log(\theta)\) and \(\tau=1/\nu\).

To incorporate covariates \(L\) (say) we specify \[\lambda(t;L)=\big(\nu \theta^{-\nu}t^{\nu-1} \big) \exp(L'\beta) := \lambda_{0}(t)\exp(L'\beta),\] where \(\lambda_{0}(t)\) is the hazard, say, for \(T_{0} \sim \hbox{Weibull}(\nu,\theta)\). This is a special case of the proportional hazards model. The chf (conditional chf with covariate vector \(L\)) is \(\Lambda(t;Z)=(t/\theta)^{\nu}\exp(L'\beta)\) so that analogous to above this leads to the representation

\[\begin{equation} \tag{3} \log(T) =\log(\theta)+\tau[-L'\beta+\epsilon] =\log(\theta)+L'\gamma + \tau \epsilon, \end{equation}\] where \(\gamma=-\tau\beta\), with \(\tau\) and \(\epsilon\) defined in (2). R survReg uses this AFT parameterization so that the estimated components of \(\gamma\), \(\gamma_{p}\) say, are that of \(-\tau{\times}\beta_{p}\) for \(p=1,\ldots,m\) (\(m\) is dimension of \(\beta\)).

When fitting the AFT model (3) via suvreg we therefore transform parameters \(\hat\gamma\) to the Weibull hazard-ratio parameterization (2) via

\[\begin{equation} \tag{4} \hat\beta = -\hat\gamma / \hat{\tau}. \end{equation}\]

As an illustration we compare the survReg model fits for the case-study dataset. The following table below compares the Weibull survReg model fits with covariates Treat and Ecog1 (Ecog = 1 vs 0) where components of \(\hat\gamma\) from model (3) are calculated according to survReg and \(\hat\beta\) are formed via (4). In the table below Weibull estimates of \(\hat\beta\) are compared to Cox model versions.

Code
# case-study example
df.case <- read.table("simdatasource/dfexplore.csv", header=TRUE, sep=",")
#names(df.case)
Code
# Comparing Weibull vs Cox with case-study where outcomes are artifical
# This is just for illustration to show conversion of Weibull parameters from 
# AFT regression to Weibull hazard 
fit.weib_ex <- survreg(Surv(tte,pmax(event,1)) ~ treat + ecog1, dist='weibull', data=df.case)
tauhat <- fit.weib_ex$scale
# convert (treat,ecog1) regression parms to Weibull hazard-ratio
bhat.weib <- -(1)*coef(fit.weib_ex)[c(2,3)]/tauhat
# Compare to Cox 
fit.cox_ex <- coxph(Surv(tte,pmax(event,1)) ~ treat + ecog1, data=df.case)
res <- cbind(bhat.weib,coef(fit.cox_ex))
res <- as.data.frame(res)
colnames(res)<-c("Weibull","Cox")
res |> gt() |>
fmt_number(columns=1:2,decimals=6) |>
tab_header(title="Comparing Weibull to Cox hazard ratio estimates",
subtitle="Case-study dataset with artificial outcomes")
Comparing Weibull to Cox hazard ratio estimates
Case-study dataset with artificial outcomes
Weibull Cox
0.157144 0.176033
0.472355 0.472196

Biomarker effects with spline model

We now outline how potential outcomes are simulated according to parameters fit to the case-study dataset but with parameters specified to induce biomarker effects. That is, causal treatment effects (on log(hazard-ratio) scale) that follow a spline model according to patterns where biomarker effects increase with biomarker levels; Including various degrees of limited treatment effects for low biomarker levels.

We first consider a Weibull model with treatment and a single biomarker covariate \(Z\) where we write the linear predictor of the Cox model \(L'\beta\) (say) as

\[\begin{equation} \tag{5} L'\beta := \beta_{1}\hbox{Treat} + \beta_{2}\hbox{Z} + \beta_{3}\hbox{Z}\hbox{Treat} + \beta_{4}(\hbox{Z}-k)I(\hbox{Z}>k) + \beta_{5}(\hbox{Z}-k)I(\hbox{Z}>k)\hbox{Treat}. \end{equation}\]

Following the potential-outcome approach let \(l_{x,z}\) denote subject’s hazard-function “had they followed treatment regimen \(Treat=x\) while having biomarker level \(Z=z\)”. That is, for subject with biomarker level \(Z=z\) we can simulate their survival outcomes under both treatment (\(x=1\)) and control (\(x=0\)) conditions. Let \(\beta^{0} = (\beta_{1}^{0},\ldots,\beta_{5}^{0})'\) denote the true coefficients and denote the hazard function as

\[\lambda_{x,z}(t) = \lambda_{0}(t)\exp(l_{x,z}), \quad \hbox{say}.\]

Writing

\[\begin{equation} l_{x,z} = \beta^{0}_{1}x + \beta^{0}_{2}z + \beta^{0}_{3}zx + \beta^{0}_{4}(z-k)I(z>k) + \beta^{0}_{5}(z-k)I(z>k)x, \end{equation}\] the log of the hazard ratio for biomarker level \(z\) under treatment (\(x=1\)) relative to control (\(x=0\)) is given by

\[\begin{equation} \tag{6} \psi^{0}(z) := \log(\lambda_{1,z}(t)/\lambda_{0,z}(t)) = \beta^{0}_{1} + \beta^{0}_{3}z + \beta^{0}_{5}(z-k)I(z>k). \end{equation}\]

The log(hazard-ratio) for biomarker levels \(z\) is a linear function of \(z\) with a change-point (in slope) at \(z=k\) given by

\[\begin{eqnarray*} \psi^{0}(z) &=& \beta^{0}_{1} + \beta^{0}_{3}z, \quad \hbox{for} \ z \leq k, \cr &=& \beta^{0}_{1} + \beta^{0}_{3}z + \beta^{0}_{5}(z-k), \quad \hbox{for} \ z > k. \end{eqnarray*}\]

Log hazard-ratio parameters \((\beta^{0}_{1},\beta^{0}_{3},\beta^{0}_{5})\) can be chosen to generate “treatment effect patterns” by specifying \(\psi^{0}(z)\) values at \(z=0\), \(z=k\), and \(z=\zeta\) for \(\zeta > k\). For specified \(\psi^{0}(0)\), \(\psi^{0}(k)\), and \(\psi^{0}(\zeta)\) we have

\[\begin{eqnarray} \beta^{0}_{1} &=& \psi^{0}(0), \cr \beta^{0}_{3} &=& {(\psi^{0}(k) - \beta^{0}_{1}) \over k}, \cr \beta^{0}_{5} &=& {(\psi^{0}(\zeta) - \beta^{0}_{1} - \beta^{0}_{3}\zeta) \over (\zeta -k)}. \end{eqnarray}\]

The function get_dgm_stratified generates \(\psi^{0}(z)\) according to desired “biomarker treatment effect patterns” as follows.

  • Let \(X\) and \(Z\) denote the treatment and biomarker variables in the case-study dataset and for specified \(k\), form the covariates \(L:=(X,Z,ZX,(Z-k)I(Z>k),(Z-k)I(Z>k)X))\);
  • Fit the Weibull model (recall on AFT scale) to get \(\log(\hat\theta)\), \(\hat\tau=1/\hat{\nu}\), and \(\hat\gamma\) corresponding to \(L\);
  • \(\hat\gamma\) is in terms of the AFT parameterization given by model (3)
  • Next transform to the Weibull (Cox) log hazard-ratio parameterization (4): \(\hat\beta = -\hat\gamma/\hat\tau\)
  • Set “true” parameters \(\theta^{0}=\hat\theta\), and \(\tau^{0}=\hat\tau\)
  • Initialize the “true” parameter \(\beta^{0} = \hat\beta\) and re-define parameters 1, 3, and 5 in order to satisfy specified \(\psi^{0}(0)\), \(\psi^{0}(k)\), and \(\psi^{0}(\zeta)\): \(\beta^{0}[1] = \psi^{0}(0)\), \(\beta^{0}[3] = (\psi^{0}(k) - \beta^{0}[1])/k\), and \(\beta^{0}[5] = (\psi^{0}(\zeta) - \beta^{0}[1] - \beta^{0}[3]\zeta)/(\zeta -k)\);
  • Form corresponding \(\gamma^{0}= -\beta^{0}\tau^{0}\)
  • For simulations we use the AFT parameterization (3) to generate \(\log(T)\) outcomes according to \(\log(T) = \log(\theta^{0}) + L'\gamma^{0} + \tau^{0}\epsilon\) where recall \(\epsilon\) has the ``extreme value’’ distribution.

The inputs of the get_dgm_stratified function are:

  • The case-study dataset (“df”)
  • “knot”=\(k\), “zeta”=\(\zeta\), and “log.hrs”= (\(\psi^{0}(0),\psi^{0}(k),\psi^{0}(\zeta))\)
  • Note that get_dgm_stratified allows for outcomes to follow stratified Weibull (Cox) models in which case the log(hazard-ratio) effects will depend on the strata, “strata_tte”, where the default is non-stratified (“strata_tte=NULL”)
  • If stratified the \(\tau\) parameters and hence \(\gamma^{0}\) depend on the strata
  • If stratified, then to simplify, the \(\gamma^{0}\) effects are calculated based on the median of the \(\tau^{0}\)’s (\(\tau^{0}=\tau_{med}\), say).

To-do –> explain outputs and how used in draw_sim_stratified

Example where treatment effects increase with increasing biomarker

  • Here we set \(\psi^{0}(10)=\log(0.5)\) so that treatment effects are increasing (hazard-ratio \(< 0.5\)) with higher values;
  • But below \(\psi^{0}(5)=\log(1.25)\) for levels \(z \leq 5\);
  • Such that \(\psi^{0}(0)=\log(3)\)
Code
df.case <- within(df.case,{
  z <- pmin(biomarker,20)  # eg, PDL-1 expression truncated at 20
  CTregimen <- ifelse(mAD_CTregimen1==0,"socA","socB") # Outcome stratification (Weibull/Cox model)
  })

# These are confounders (fixed) that were characteristics from observed data
confounders.name <- c("age","biomarker","male","EU_region","AP_region","histology","surgery","ecog1","mAD_CTregimen1")

# Outcome model is NON-stratified (strata_tte=NULL)

# Strong biomarker 

log.hrs <- log(c(3,1.25,0.50))

dgm <- get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=FALSE)

# Simulate with randomization factors "stratum" 
# Note that strata_tte="CTregimen" is a simplification (collapsing over region and metastatic) of "stratum"
# xname="AP_region" specifies prognostic effect factor
# bx=log(5) denotes log(hazard ratio) relative effect with respect to xname factor

# Draw very large-sample to get approximation to bias
# Note: return_df will NOT return the large simulated dataset but the "large-sample" estimates as a list
# checking=TRUE will compare the estimated tau (stratified) parameters based on the case-study dataset to 
# the versions based on the simulated dataset
# details=TRUE will output the approximations to Cox model estimators from several models ("population summaries")
# return_df =FALSE will return the "population summaries" as a list as well as the biomarker AHR functional (see below)

# Strong prognostic factor 
pop_summary <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=TRUE,details=TRUE,return_df=FALSE)
% censored = 0.20918 
Stratification parm (taus) df_super 0.9345439 
Stratification parm (taus) simulated= 1.013666 
Max |loghr.po - (log.Y0-log.Y1)/tau| =  0 
          rn      Weibull          Cox
      <char>        <num>        <num>
1:     treat -0.042255946 -0.038177033
2:         z  0.029601994  0.027480098
3:   z.treat -0.004467330 -0.004397499
4:       z.k -0.136372426 -0.131144452
5: z.k.treat  0.008684274  0.008493494
6:         w -1.447437967 -1.410005289
Overall empirical AHR, CDE*, CDE**= 0.7367854 0.7367854 1.258119 
CDE* W=1, W=0 0.7206492 0.7420908 
CDE** W=1, W=0 1.197564 1.262121 
Cox ITT: Un-adjusted, sR, W 0.7903797 0.7641313 0.7457578 
Cox W=1 Sub-population 1.015991 
Cox W=0 Sub-population 0.718739 

Code
# An illustrative simulated example dataset
df_example <- draw_sim_stratified(dgm=dgm,ss=123,wname="AP_region",bw=-log(5),strata_rand="stratum")

# Subgroups identified with nonAP population
df_nonAP <- subset(df_example,AP_region==0)
# Applied to AP
df_AP <- subset(df_example,AP_region==1)
df_ITT <- df_example


# True causal AHR
ahr_true = c(round(pop_summary$AHR,4))
# AHR Cox un-adjusted (only treatment)
ahr_unadj = c(round(pop_summary$ITT_unadj,4))
# Treatment plus randomization stratificaton sR
ahr_sR = c(round(pop_summary$ITT_sR,4))
# sR including adjustment by x
ahr_sRw = c(round(pop_summary$ITT_sRw,4))

# Relative biases 
bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)
bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)
bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)

# The bias within X=1 population
ahr_w1 = c(round(pop_summary$W_1,4))
bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)

We define the biomarker average hazard ratio (AHR) as the expected value of \(\psi^{0}(\cdot)\) across “biomarker positive” sub-populations. For example, \(\hbox{AHR}(2^{+})\) represents the AHR for subjects with biomarker values \(\geq 2\).

\[\hbox{AHR}(z^{+}):= \exp\left\{E_{Z \geq z} \psi^{0}(Z) \right\}.\]

Code
# non-stratified

plot(pop_summary$zpoints,pop_summary$HR.zpoints,xlab="z",ylab="Average hazard ratio (AHR)",type="s",lty=1,col="black",lwd=2)
rug(jitter(df.case$z))
title(main="AHR(z+)")

  • Overall Population (large-sample) –
    • The overall true (causal) average hazard-ratio (AHR) = 0.7368
    • The \(\approx\) asymptotic limit of un-adjusted Cox (only treatment) = 0.7904
    • The \(\approx\) asymptotic limit of stratified Cox (treatment and stratified by stratum) = 0.7641
    • The \(\approx\) asymptotic limit of stratified Cox with adjustment by \(x\) = 0.7458

The relative over-estimation biases for the Cox model estimates when un-adjusted, stratified (by randomization factors), and stratified + adjusted by the strong prognostic factor \(X\) are: 7.3%, 3.7%, and 1.2%, respectively.

Moreover, within the AP_region (\(X=1\)) the \(\approx\) limit of the Cox model estimator based on the (AP_region) regional local data is 1.016 with corresponding bias of 37.9%.

We would therefore expect challenges with establishing consistency based on evaluating the local regional data.

Lets check if there is any imbalance (asymptotically) by biomarker by drawing a large sample (N=100,000) and viewing summary tables:

Code
# This time return large dataset (corresponding to that used in the above asymptotic approximations)
dflarge <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)
# create factor versions of treatment and AP region
dflarge$trtsim <- ifelse(dflarge$treat.sim==1,"Experimental","Control")
dflarge$apregion <- ifelse(dflarge$AP_region==1,"AP","non-AP")
dflarge$bm_lt2 <- ifelse(dflarge$biomarker<2,"biomarker<2","biomarker>=2")
table1( ~ biomarker + bm_lt2 | trtsim, data=dflarge, caption=c("ITT by treatment arm"))
ITT by treatment arm
Control
(N=50001)
Experimental
(N=49999)
Overall
(N=100000)
biomarker
Mean (SD) 13.8 (22.4) 13.9 (22.5) 13.9 (22.4)
Median [Min, Max] 5.00 [0, 100] 5.00 [0, 100] 5.00 [0, 100]
bm_lt2
biomarker<2 11930 (23.9%) 11990 (24.0%) 23920 (23.9%)
biomarker>=2 38071 (76.1%) 38009 (76.0%) 76080 (76.1%)
Code
table1( ~ biomarker + bm_lt2 | apregion, data=dflarge, caption=c("ITT by AP region"))
ITT by AP region
AP
(N=24472)
non-AP
(N=75528)
Overall
(N=100000)
biomarker
Mean (SD) 11.2 (15.3) 14.7 (24.2) 13.9 (22.4)
Median [Min, Max] 5.00 [1.00, 80.0] 5.00 [0, 100] 5.00 [0, 100]
bm_lt2
biomarker<2 4922 (20.1%) 18998 (25.2%) 23920 (23.9%)
biomarker>=2 19550 (79.9%) 56530 (74.8%) 76080 (76.1%)
Code
table1( ~ biomarker + bm_lt2 | trtsim, data=subset(dflarge,AP_region==1), caption=c("AP region by treatment arm"))
AP region by treatment arm
Control
(N=12237)
Experimental
(N=12235)
Overall
(N=24472)
biomarker
Mean (SD) 11.2 (15.3) 11.2 (15.3) 11.2 (15.3)
Median [Min, Max] 5.00 [1.00, 80.0] 5.00 [1.00, 80.0] 5.00 [1.00, 80.0]
bm_lt2
biomarker<2 2442 (20.0%) 2480 (20.3%) 4922 (20.1%)
biomarker>=2 9795 (80.0%) 9755 (79.7%) 19550 (79.9%)
Code
rm("dflarge")

Next, a simulated example with same sample size as the case-study

  • Here we simulated a study according to the biomarker effects described above
  • Summarize the non-AP and AP populations
  • Apply subgroup identification procedures to the non-AP data and apply to AP
Code
# non-stratified
kmH.fit<-KM.plot.2sample.weighted(df=df_example, 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",
risk.set=TRUE, by.risk=6,
details=TRUE,
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
Cox un-adjusted HR= 0.76 
Cox CIs= 0.624 0.925 
My p-value and survdiff= 0.006076202 0.006076202 
My z^2 and survdiff= 7.527564 7.527564 
Comparing with R's survfit 
Treat=1: Max error max|KM(survfit)[t]-KM(mine)[t]| = 0 
Treat=0: Max error max|KM(survfit)[t]-KM(mine)[t]| = 0 
Comparing with R's survfit (experimental 
Treat=1: n,events= 252 185 
Median, Lower, Upper= 17.19713 14.14832 21.0917 
survfit medians 
  records     n.max   n.start    events     rmean se(rmean)    median   0.95LCL 
   252.00    252.00    252.00    185.00     27.33      1.55     17.20     14.15 
  0.95UCL 
    21.09 
Comparing with R's survfit (Control 
Treat=0: n,events= 255 217 
Median, Lower, Upper= 13.42839 11.46187 17.93717 
survfit medians 
  records     n.max   n.start    events     rmean se(rmean)    median   0.95LCL 
   255.00    255.00    255.00    217.00     21.64      1.29     13.43     11.46 
  0.95UCL 
    17.94 
Code
title(main="ITT Population")

Now evaluate non-AP and AP —> AP is “diluted”

Code
# non-stratified
par(mfrow=c(1,2))
kmH.fit<-KM.plot.2sample.weighted(df=df_nonAP, 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",
risk.set=TRUE, by.risk=6, details=FALSE,
show.logrank=FALSE, put.legend.arms="top",
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
title(main="Non-AP Population")

kmH.fit<-KM.plot.2sample.weighted(df=df_AP, 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",
risk.set=TRUE, by.risk=6, details=FALSE,
show.logrank=FALSE, put.legend.arms="top",
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
title(main="AP Population")

Note that results stratified by randomization (or true outcome strata) are similar

Code
# stratified
par(mfrow=c(1,2))
kmH.fit<-KM.plot.2sample.weighted(df=df_nonAP, 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",strata.name="strata.simR",
risk.set=TRUE, by.risk=6, details=FALSE,
show.logrank=FALSE, put.legend.arms="top",
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
title(main="Non-AP Population")

kmH.fit<-KM.plot.2sample.weighted(df=df_AP, 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",strata.name="strata.simR",
risk.set=TRUE, by.risk=6, details=FALSE,
show.logrank=FALSE, put.legend.arms="top",
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
title(main="AP Population")

Balance within simulated trial

Code
# This time return large dataset (corresponding to that used in the above asymptotic approximations)
df <- df_example
# create factor versions of treatment and AP region
df$trtsim <- ifelse(df$treat.sim==1,"Experimental","Control")
df$apregion <- ifelse(df$AP_region==1,"AP","non-AP")
df$bm_lt2 <- ifelse(df$biomarker<2,"biomarker<2","biomarker>=2")
table1( ~ biomarker + bm_lt2 | trtsim, data=df, caption=c("ITT by treatment arm"))
ITT by treatment arm
Control
(N=255)
Experimental
(N=252)
Overall
(N=507)
biomarker
Mean (SD) 15.7 (24.2) 11.8 (20.1) 13.8 (22.3)
Median [Min, Max] 5.00 [0, 100] 5.00 [1.00, 100] 5.00 [0, 100]
bm_lt2
biomarker<2 59 (23.1%) 61 (24.2%) 120 (23.7%)
biomarker>=2 196 (76.9%) 191 (75.8%) 387 (76.3%)
Code
table1( ~ biomarker + bm_lt2 | apregion, data=df, caption=c("ITT by AP region"))
ITT by AP region
AP
(N=125)
non-AP
(N=382)
Overall
(N=507)
biomarker
Mean (SD) 11.0 (15.2) 14.7 (24.1) 13.8 (22.3)
Median [Min, Max] 5.00 [1.00, 80.0] 5.00 [0, 100] 5.00 [0, 100]
bm_lt2
biomarker<2 25 (20.0%) 95 (24.9%) 120 (23.7%)
biomarker>=2 100 (80.0%) 287 (75.1%) 387 (76.3%)
Code
table1( ~ biomarker + bm_lt2 | trtsim, data=subset(df,AP_region==1), caption=c("AP region by treatment arm"))
AP region by treatment arm
Control
(N=62)
Experimental
(N=63)
Overall
(N=125)
biomarker
Mean (SD) 12.9 (17.7) 9.19 (12.1) 11.0 (15.2)
Median [Min, Max] 5.00 [1.00, 80.0] 5.00 [1.00, 71.0] 5.00 [1.00, 80.0]
bm_lt2
biomarker<2 10 (16.1%) 15 (23.8%) 25 (20.0%)
biomarker>=2 52 (83.9%) 48 (76.2%) 100 (80.0%)

Let’s look at 9 simulated AP realizations

Code
par(mfrow=c(3,3))

for(ss in 1:9){
df_ex <- draw_sim_stratified(dgm=dgm,ss=ss,wname="AP_region",bw=-log(5),strata_rand="stratum")
kmH.fit<-KM.plot.2sample.weighted(df=subset(df_ex,AP_region==1), 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",strata.name="strata.simR",
risk.set=TRUE, by.risk=6, details=FALSE, show.med=FALSE, cox.cex= 1.25,
show.logrank=FALSE, put.legend.arms="top",
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
title(main="AP Population")
}

What about null prognostic (AP region) factor \(\beta_{w} = 0\)

Code
par(mfrow=c(3,3))

for(ss in 1:9){
df_ex <- draw_sim_stratified(dgm=dgm,ss=ss,wname="AP_region",bw=-log(1),strata_rand="stratum")
kmH.fit<-KM.plot.2sample.weighted(df=subset(df_ex,AP_region==1), 
tte.name="y.sim", event.name="event.sim", treat.name="treat.sim",strata.name="strata.simR",
risk.set=TRUE, by.risk=6, details=FALSE,show.med=FALSE, cox.cex= 1.25,
show.logrank=FALSE, put.legend.arms="top",
Xlab="Months",Ylab="Overall survival",
arms=c("Experimental","Control"))
title(main="AP Population")
}

Compare non-parametric (cubic-spline) fit with true \(\psi^{0}\) for ITT dataset

Biomarker effects?

Code
# ydel (default=0.25) is how much room to provide in space for counts
temp <- cox_cs_fit2(df=df_example,tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",
strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, show_plot=TRUE,truebeta.name="loghr.po")
title("ITT")

Cubic-spline fit to non-AP and AP regions

Spline model suggests benefit in AP region

Code
par(mfrow=c(1,2))

temp <- cox_cs_fit(df=df_nonAP,tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",
strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, show_plot=TRUE,truebeta.name="loghr.po")
title("Non-AP")


temp <- cox_cs_fit(df=df_AP,tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",
strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, show_plot=TRUE,truebeta.name="loghr.po")
title("AP")

Examine Non-AP data used for subgroup identification (“training data”): Forestplot of individual factors and baseline demographics

Code
fp <- forestplot_baseline(df=df_nonAP,confounders.name=confounders.name,
arrow_text=c("favors Treatment","Control"),E.name=c("Treatment"),C.name=c("Control"),outcome.name="y.sim",event.name="event.sim",treat.name="treat.sim",
xlim = c(0.25, 2.0), ticks_at = c(0.65, 1.0, 1.3))

plot(fp$p)

Code
aa <- paste(confounders.name, collapse=" + ")
aa <- paste0("~ " ,aa)
aa <- paste0(aa," | treat_name")
aa <- as.formula(aa)

table1( aa , data=df_nonAP, caption=c("Simulated dataset: non-AP (training data)"))
Simulated dataset: non-AP (training data)
Control
(N=189)
Experimental
(N=193)
Overall
(N=382)
age
Mean (SD) 59.2 (13.0) 60.0 (11.6) 59.6 (12.3)
Median [Min, Max] 62.0 [23.0, 87.0] 61.0 [22.0, 83.0] 61.0 [22.0, 87.0]
biomarker
Mean (SD) 12.4 (21.0) 16.9 (26.7) 14.7 (24.1)
Median [Min, Max] 5.00 [1.00, 100] 5.00 [0, 100] 5.00 [0, 100]
male
Mean (SD) 0.730 (0.445) 0.751 (0.433) 0.741 (0.439)
Median [Min, Max] 1.00 [0, 1.00] 1.00 [0, 1.00] 1.00 [0, 1.00]
EU_region
Mean (SD) 0.481 (0.501) 0.420 (0.495) 0.450 (0.498)
Median [Min, Max] 0 [0, 1.00] 0 [0, 1.00] 0 [0, 1.00]
AP_region
Mean (SD) 0 (0) 0 (0) 0 (0)
Median [Min, Max] 0 [0, 0] 0 [0, 0] 0 [0, 0]
histology
Mean (SD) 0.344 (0.476) 0.404 (0.492) 0.374 (0.485)
Median [Min, Max] 0 [0, 1.00] 0 [0, 1.00] 0 [0, 1.00]
surgery
Mean (SD) 0.233 (0.424) 0.264 (0.442) 0.249 (0.433)
Median [Min, Max] 0 [0, 1.00] 0 [0, 1.00] 0 [0, 1.00]
ecog1
Mean (SD) 0.587 (0.494) 0.611 (0.489) 0.599 (0.491)
Median [Min, Max] 1.00 [0, 1.00] 1.00 [0, 1.00] 1.00 [0, 1.00]
mAD_CTregimen1
Mean (SD) 0.476 (0.501) 0.487 (0.501) 0.482 (0.500)
Median [Min, Max] 0 [0, 1.00] 0 [0, 1.00] 0 [0, 1.00]

Finding Non-AP subgroup with highest consistency rate (in favor of control)

In the following we evaluate subgroups based on single-factors (e.g., “biomarker < J” say, or “Age <= 65”, etc) and select the subgroup with the highest consistency rate with Cox hazard-ratio estimate meeting selection criteria; The selected subgroup with highest consistency rate meeting hazard ratio estimate criterion (log hazard-ratio estimate \(\log(\hat\beta) \geq log(0.90)\), say) where the “consistency rate” is at least \(90\)%.

To do —> Refer criterion (hazard ratio thresholds) to equations in paper León et al. (2024)

Code
# Show the first (up to) 10 subgroups ("showten_subgroups=TRUE") meeting identification criteria 
# Subgroups based on only single-factors ("maxk=1")

# Note that sg_focus= "hr" sorts the subgroups by consistency rate and largest hazard-ratio estimates  
# Selects the subgroup with the largest consistency rate;  If there are ties among consistency rates, then 
# the subgroup with largest HR estimate is selected estimate satisfying the (minimum) consisency criteria 
# (pconsistency.threshold)

# Note: we exclude cuts for biomarker which are "<=" in order to only consider "<" 
fs <- forestsearch(df.analysis=df_nonAP, df.test=df_AP, confounders.name=confounders.name,
outcome.name="y.sim",treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = 0.90, hr.consistency = 0.80, pconsistency.threshold = 0.90,
sg_focus = "hr",
showten_subgroups=TRUE, details=TRUE,
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts = c("biomarker <="),
maxk=1, plot.sg=FALSE)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q9 q8 q10 q11 q7 q21 q17 q18 q20 q13 q12 q14 q1 q3 q15 q6 q19 q4 q16 q5 q2 
          Factors Labels VI(grf)
9   biomarker < 8     q9  0.2154
8   biomarker < 7     q8  0.1573
10  biomarker < 9    q10  0.1192
11 biomarker < 10    q11  0.1111
7   biomarker < 6     q7  0.0779
21 mAD_CTregimen1    q21  0.0427
17      EU_region    q17  0.0349
18      histology    q18  0.0313
20          ecog1    q20  0.0300
13      age <= 61    q13  0.0283
12    age <= 59.6    q12  0.0242
14      age <= 52    q14  0.0203
1       age <= 65     q1  0.0180
3   biomarker < 2     q3  0.0167
15      age <= 68    q15  0.0146
6   biomarker < 5     q6  0.0130
19        surgery    q19  0.0117
4   biomarker < 3     q4  0.0112
16           male    q16  0.0111
5   biomarker < 4     q5  0.0110
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 8.333333e-05 
Approximately 10% of max_count met: minutes 0.0001166667 
Approximately 20% of max_count met: minutes 0.0001833333 
Approximately 33% of max_count met: minutes 0.00025 
Approximately 50% of max_count met: minutes 0.0003333333 
Approximately 75% of max_count met: minutes 0.0004833333 
Approximately 90% of max_count met: minutes 0.0005833333 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0006166667 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= hr 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 5} 1 
Consistency met! 
# of splits, Kmet= 1000 2 
Subgroup, % Consistency Met= {biomarker < 2} 0.998 
Consistency met! 
# of splits, Kmet= 1000 3 
Subgroup, % Consistency Met= {biomarker < 4} 1 
Consistency met! 
# of splits, Kmet= 1000 4 
Subgroup, % Consistency Met= {biomarker < 3} 1 
Consistency met! 
# of splits, Kmet= 1000 5 
Subgroup, % Consistency Met= {biomarker < 6} 1 
Consistency met! 
# of splits, Kmet= 1000 6 
Subgroup, % Consistency Met= {biomarker < 7} 1 
Consistency met! 
# of splits, Kmet= 1000 7 
Subgroup, % Consistency Met= {biomarker < 8} 1 
Consistency met! 
# of splits, Kmet= 1000 8 
Subgroup, % Consistency Met= {biomarker < 9} 1 
Kmet (found), Subgroup, % Consistency= 8 !{age <= 68} 0.75 
SG focus= hr 
Subgroup Consistency Minutes= 0.176 
Subgroup found (FS) 
Minutes forestsearch overall= 0.1812333 
Code
if(output) save(fs,file="output/sim3_k1tenhr.Rdata")

Display the first 10 subgroups and Kaplan-Meier plots within the estimated subgroups –

Notice that the first 10 subgroups are sorted by decreasing HR estimates and the selected subgroup corresponds to the highest hazard ratio estimate.

Code
if(loadit) load(file="output/sim3_k1tenhr.Rdata")
# Consistency for subgroups contained in "grp.consistency"
sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates")$sg10_out
Subgroups formed by single-factors
maxk=1
M.1 N E hr Pcons
{biomarker < 5} 149 148 1.658330 1.000000
{biomarker < 4} 148 147 1.643559 1.000000
{biomarker < 3} 135 134 1.584584 1.000000
{biomarker < 6} 219 218 1.570647 1.000000
{biomarker < 7} 235 233 1.563982 1.000000
{biomarker < 8} 238 236 1.537168 1.000000
{biomarker < 9} 241 239 1.456963 1.000000
{biomarker < 2} 95 94 1.655311 0.998000
Code
#sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")$tab_estimates

# Show detailed estimates for the next analysis
id_harm <- paste(fs$sg.harm,collapse=" & ")

The selected subgroup is {biomarker < 5}

Next plot the K-M curves within the estimated non-AP (training) subgroups.

Code
plot_found.subgroups(fs=fs, df=fs$df.est, title_itt=c("Non-AP ITT"))

Finding Non-AP subgroup with smallest sample size meeting selection criteria

We now select the smallest subgroup meeting: log hazard-ratio estimate \(\log(\hat\beta) \geq log(0.90)\) with “consistency rate” at least \(90\)%.

Code
# sg_focus="MinSG" selects smallest subgroup meeting selection criteria

fs <- forestsearch(df.analysis=df_nonAP, df.test=df_AP, confounders.name=confounders.name,
outcome.name="y.sim",treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = 0.90, hr.consistency = 0.80, pconsistency.threshold = 0.90,
sg_focus = "minSG",
showten_subgroups=TRUE, details=TRUE,
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="),
maxk=1, plot.sg=TRUE)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q9 q8 q10 q11 q7 q21 q17 q18 q20 q13 q12 q14 q1 q3 q15 q6 q19 q4 q16 q5 q2 
          Factors Labels VI(grf)
9   biomarker < 8     q9  0.2154
8   biomarker < 7     q8  0.1573
10  biomarker < 9    q10  0.1192
11 biomarker < 10    q11  0.1111
7   biomarker < 6     q7  0.0779
21 mAD_CTregimen1    q21  0.0427
17      EU_region    q17  0.0349
18      histology    q18  0.0313
20          ecog1    q20  0.0300
13      age <= 61    q13  0.0283
12    age <= 59.6    q12  0.0242
14      age <= 52    q14  0.0203
1       age <= 65     q1  0.0180
3   biomarker < 2     q3  0.0167
15      age <= 68    q15  0.0146
6   biomarker < 5     q6  0.0130
19        surgery    q19  0.0117
4   biomarker < 3     q4  0.0112
16           male    q16  0.0111
5   biomarker < 4     q5  0.0110
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 3.333333e-05 
Approximately 10% of max_count met: minutes 6.666667e-05 
Approximately 20% of max_count met: minutes 0.00015 
Approximately 33% of max_count met: minutes 0.0002166667 
Approximately 50% of max_count met: minutes 3e-04 
Approximately 75% of max_count met: minutes 0.00045 
Approximately 90% of max_count met: minutes 0.0005333333 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0005833333 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Kmet (found), Subgroup, % Consistency= 0 !{age <= 68} 0.75 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 2} 0.998 
Consistency met! 
# of splits, Kmet= 1000 2 
Subgroup, % Consistency Met= {biomarker < 3} 1 
Consistency met! 
# of splits, Kmet= 1000 3 
Subgroup, % Consistency Met= {biomarker < 4} 1 
Consistency met! 
# of splits, Kmet= 1000 4 
Subgroup, % Consistency Met= {biomarker < 5} 1 
Consistency met! 
# of splits, Kmet= 1000 5 
Subgroup, % Consistency Met= {biomarker < 6} 1 
Consistency met! 
# of splits, Kmet= 1000 6 
Subgroup, % Consistency Met= {biomarker < 7} 1 
Consistency met! 
# of splits, Kmet= 1000 7 
Subgroup, % Consistency Met= {biomarker < 8} 1 
Consistency met! 
# of splits, Kmet= 1000 8 
Subgroup, % Consistency Met= {biomarker < 9} 1 
Number of subgroups meeting consistency criteria= 
   Pcons       hr     N     E      g      m     K             M.1
   <num>    <num> <num> <num> <char> <char> <num>          <char>
1: 0.998 1.655311    95    94      6      2     1 {biomarker < 2}
2: 1.000 1.584584   135   134      9      3     1 {biomarker < 3}
3: 1.000 1.643559   148   147     10      4     1 {biomarker < 4}
4: 1.000 1.658330   149   148      8      5     1 {biomarker < 5}
5: 1.000 1.570647   219   218      5      6     1 {biomarker < 6}
6: 1.000 1.563982   235   233      2      7     1 {biomarker < 7}
7: 1.000 1.537168   238   236      1      8     1 {biomarker < 8}
8: 1.000 1.456963   241   239      3      9     1 {biomarker < 9}

[1] "{biomarker < 2}"
% consistency criteria met= 0.998 
SG focus= minSG 
Subgroup Consistency Minutes= 0.1733167 
Subgroup found (FS) 
Minutes forestsearch overall= 0.1754667 
Code
if(output) save(fs,file="output/sim3_k1tenmin.Rdata")
Code
if(loadit) load(file="output/sim3_k1tenmin.Rdata")

# Training summaries
SG_est <- sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")

id_harm <- paste(fs$sg.harm,collapse=" & ")

Top 10 (up to 10) subgroups meeting criteria:

Code
# Top 10 (up to 10) SG's
SG_est$sg10_out
Subgroups formed by single-factors
maxk=1
M.1 N E hr Pcons
{biomarker < 2} 95 94 1.655311 0.998000
{biomarker < 3} 135 134 1.584584 1.000000
{biomarker < 4} 148 147 1.643559 1.000000
{biomarker < 5} 149 148 1.658330 1.000000
{biomarker < 6} 219 218 1.570647 1.000000
{biomarker < 7} 235 233 1.563982 1.000000
{biomarker < 8} 238 236 1.537168 1.000000
{biomarker < 9} 241 239 1.456963 1.000000

Estimates within training (un-adjusted)

Code
SG_est$tab_estimates
Non-AP (training) estimates
Subgroup n n1 events m1 m0 RMST HR (95% CI) AHR(po)
ITT 382 (100%) 189 (49%) 343 (90%) 11.8 10.7 4.72 (1.58,7.86) 0.68 (0.55,0.84) 0.74
Questionable 95 (25%) 46 (48%) 94 (99%) 6.2 8.2 -3.77 (-7.14,-0.39) 1.66 (1.09,2.52) 2.52
Recommend 287 (75%) 143 (50%) 249 (87%) 16 11.8 7.28 (3.68,10.87) 0.55 (0.42,0.71) 0.49

The selected subgroup is {biomarker < 2}

Next plot the K-M curves within the estimated non-AP subgroups.

Code
plot_found.subgroups(fs=fs, df=fs$df.est, title_itt=c("Non-AP ITT"))

Next, display the estimates within the AP population (testing data)

Code
# Training summaries
SG_test <- sg_tables(fs=fs,which_df="testing",est_caption="AP (testing) estimates",potentialOutcome.name="loghr.po")
Code
SG_test$tab_estimates
AP (testing) estimates
Subgroup n n1 events m1 m0 RMST HR (95% CI) AHR(po)
ITT 125 (100%) 63 (50%) 59 (47%) 49.3 NA 0.37 (-5.42,6.16) 1.08 (0.65,1.81) 0.73
Questionable 25 (20%) 15 (60%) 15 (60%) 26.3 NA -7.19 (-17.55,3.16) 3.43 (0.96,12.28) 2.52
Recommend 100 (80%) 48 (48%) 44 (44%) NA NA 3.02 (-2.54,8.59) 0.77 (0.42,1.39) 0.54

K-M curves applied to AP subgroups.

Code
plot_found.subgroups(fs=fs, df=fs$df.test, title_itt=c("AP ITT"))

Now, if we are only concerned with finding the selected subgroup then the above calculations can be much faster by removing the option to output the first 10 subgroups meeting the selection criteria (“showten_subgroups=FALSE” is the default). This is the default and strongly recommended if running simulations or if there is interest in calculating adjusted hazard-ratio estimates for the selected subgroup. However it is crucial to note that in our applications we are using the non-AP data for subgroup identification and interested in the estimates based on the AP local data. Therefore the non-AP data represents the training dataset and the AP data represents the testing dataset where such adjustment is not necessary (that is, the AP data is not utilized for the subgroup selection).

Here we turn-off showing the first ten subgroups (default in forestsearch) and apply the estimated subgroups to the AP data. Note that we now use the sg_focus=“minSG” criterion to select the smallest subgroup.

Code
# df.test = df_AP indicates that the selected subgroup identification flags 
# (based on df_nonAP analysis) will be appended to df_AP

# Speed is increased by first sorting by subgroup size and once a subgroup 
# meets the consistency criterion the overall criteria is met and the algorithm stops

fs <- forestsearch(df.analysis=df_nonAP, df.test=df_AP, 
confounders.name=confounders.name,
outcome.name="y.sim", treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = 0.90, hr.consistency = 0.80, pconsistency.threshold = 0.90,
sg_focus="minSG", details=TRUE,
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="), maxk=1)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q9 q8 q10 q11 q7 q21 q17 q18 q20 q13 q12 q14 q1 q3 q15 q6 q19 q4 q16 q5 q2 
          Factors Labels VI(grf)
9   biomarker < 8     q9  0.2154
8   biomarker < 7     q8  0.1573
10  biomarker < 9    q10  0.1192
11 biomarker < 10    q11  0.1111
7   biomarker < 6     q7  0.0779
21 mAD_CTregimen1    q21  0.0427
17      EU_region    q17  0.0349
18      histology    q18  0.0313
20          ecog1    q20  0.0300
13      age <= 61    q13  0.0283
12    age <= 59.6    q12  0.0242
14      age <= 52    q14  0.0203
1       age <= 65     q1  0.0180
3   biomarker < 2     q3  0.0167
15      age <= 68    q15  0.0146
6   biomarker < 5     q6  0.0130
19        surgery    q19  0.0117
4   biomarker < 3     q4  0.0112
16           male    q16  0.0111
5   biomarker < 4     q5  0.0110
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 5e-05 
Approximately 10% of max_count met: minutes 8.333333e-05 
Approximately 20% of max_count met: minutes 0.0002666667 
Approximately 33% of max_count met: minutes 0.0003333333 
Approximately 50% of max_count met: minutes 0.0004166667 
Approximately 75% of max_count met: minutes 0.0005666667 
Approximately 90% of max_count met: minutes 0.00065 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0006833333 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Kmet (found), Subgroup, % Consistency= 0 !{age <= 68} 0.75 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 2} 0.998 
SG focus= minSG 
Subgroup Consistency Minutes= 0.03761667 
Subgroup found (FS) 
Minutes forestsearch overall= 0.03965 
Code
if(output) save(fs,file="output/sim3_k1min.Rdata")

Apply estimated subgroup to the AP data —

Code
if(loadit) load(file="output/sim3_k1min.Rdata")

#sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")$tab_estimates

# Testing summaries
SG_test <- sg_tables(fs=fs,which_df="testing",est_caption="AP (testing) estimates",potentialOutcome.name="loghr.po")
Code
SG_test$tab_estimates
AP (testing) estimates
Subgroup n n1 events m1 m0 RMST HR (95% CI) AHR(po)
ITT 125 (100%) 63 (50%) 59 (47%) 49.3 NA 0.37 (-5.42,6.16) 1.08 (0.65,1.81) 0.73
Questionable 25 (20%) 15 (60%) 15 (60%) 26.3 NA -7.19 (-17.55,3.16) 3.43 (0.96,12.28) 2.52
Recommend 100 (80%) 48 (48%) 44 (44%) NA NA 3.02 (-2.54,8.59) 0.77 (0.42,1.39) 0.54

K-M curves applied to AP subgroups (repeat of above).

Code
plot_found.subgroups(fs=fs, df=fs$df.test, title_itt=c("AP ITT"))

Next, consider finding the largest subgroup with strong benefit

Here we are directly targeting the largest subgroup with a strong benefit. This is done by simply reversing the role of treatment so that we seek the largest subgroup for which the hazard ratio is below a clinical threshold (e.g., \(\leq 0.6\)).

Code
# sg_focus="MaxSG" selects smallest subgroup meeting selection criteria
# Reverse role of treatment so that we find "worst control" subgroup
dfnew <- copy(df_nonAP)
dfnew$control.sim <- 1-dfnew$treat.sim

dftest <- copy(df_AP)
dftest$control.sim <- 1-dftest$treat.sim

fs_k1_10_maxSG <- forestsearch(df.analysis=dfnew, df.test=dftest, confounders.name=confounders.name,
outcome.name="y.sim",treat.name="control.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
est.scale="1/hr", hr.threshold = 1/0.6, hr.consistency = 1/0.7, pconsistency.threshold = 0.90,
sg_focus = "maxSG",
showten_subgroups=TRUE, details=TRUE,
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="),
maxk=1, plot.sg=TRUE)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q9 q8 q10 q11 q7 q21 q17 q18 q20 q13 q12 q14 q1 q3 q15 q6 q19 q4 q16 q5 q2 
          Factors Labels VI(grf)
9   biomarker < 8     q9  0.2150
8   biomarker < 7     q8  0.1574
10  biomarker < 9    q10  0.1192
11 biomarker < 10    q11  0.1122
7   biomarker < 6     q7  0.0765
21 mAD_CTregimen1    q21  0.0430
17      EU_region    q17  0.0347
18      histology    q18  0.0314
20          ecog1    q20  0.0299
13      age <= 61    q13  0.0284
12    age <= 59.6    q12  0.0242
14      age <= 52    q14  0.0206
1       age <= 65     q1  0.0182
3   biomarker < 2     q3  0.0167
15      age <= 68    q15  0.0146
6   biomarker < 5     q6  0.0131
19        surgery    q19  0.0116
4   biomarker < 3     q4  0.0111
16           male    q16  0.0111
5   biomarker < 4     q5  0.0110
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 3.333333e-05 
Approximately 10% of max_count met: minutes 6.666667e-05 
Approximately 20% of max_count met: minutes 0.0001333333 
Approximately 33% of max_count met: minutes 2e-04 
Approximately 50% of max_count met: minutes 0.0002833333 
Approximately 75% of max_count met: minutes 0.0004166667 
Approximately 90% of max_count met: minutes 5e-04 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.00065 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= maxSG 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= !{biomarker < 2} 0.951 
Consistency met! 
# of splits, Kmet= 1000 2 
Subgroup, % Consistency Met= !{biomarker < 3} 0.998 
Consistency met! 
# of splits, Kmet= 1000 3 
Subgroup, % Consistency Met= !{biomarker < 4} 0.999 
Consistency met! 
# of splits, Kmet= 1000 4 
Subgroup, % Consistency Met= !{biomarker < 5} 0.998 
Consistency met! 
# of splits, Kmet= 1000 5 
Subgroup, % Consistency Met= !{biomarker < 6} 1 
Consistency met! 
# of splits, Kmet= 1000 6 
Subgroup, % Consistency Met= !{biomarker < 7} 1 
Consistency met! 
# of splits, Kmet= 1000 7 
Subgroup, % Consistency Met= !{biomarker < 8} 1 
Consistency met! 
# of splits, Kmet= 1000 8 
Subgroup, % Consistency Met= !{biomarker < 9} 1 
Kmet (found), Subgroup, % Consistency= 8 !{male} 0.597 
Number of subgroups meeting consistency criteria= 
   Pcons       hr     N     E      g      m     K              M.1
   <num>    <num> <num> <num> <char> <char> <num>           <char>
1: 0.951 1.824302   287   249      6      1     1 !{biomarker < 2}
2: 0.998 2.136984   247   209      8      2     1 !{biomarker < 3}
3: 0.999 2.287431   234   196     10      3     1 !{biomarker < 4}
4: 0.998 2.321638   233   195      7      4     1 !{biomarker < 5}
5: 1.000 3.290612   163   125      5      5     1 !{biomarker < 6}
6: 1.000 4.212110   147   110      2      6     1 !{biomarker < 7}
7: 1.000 4.546661   144   107      1      7     1 !{biomarker < 8}
8: 1.000 4.748353   141   104      3      8     1 !{biomarker < 9}

[1] "!{biomarker < 2}"
% consistency criteria met= 0.951 
SG focus= maxSG 
Subgroup Consistency Minutes= 0.1734167 
Subgroup found (FS) 
Minutes forestsearch overall= 0.17545 
Code
if(output) save(fs_k1_10_maxSG,file="output/sim3_k1tenmax.Rdata")
Code
if(loadit) load(file="output/sim3_k1tenmax.Rdata")
fs <- fs_k1_10_maxSG
# Testing summaries
SG_est <- sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")

Top 10 (up to 10) subgroups meeting criteria:

Code
# Top 10 (up to 10) SG's
SG_est$sg10_out
Subgroups formed by single-factors
maxk=1
M.1 N E hr Pcons
!{biomarker < 2} 287 249 0.548155 0.951000
!{biomarker < 3} 247 209 0.467949 0.998000
!{biomarker < 4} 234 196 0.437172 0.999000
!{biomarker < 5} 233 195 0.430730 0.998000
!{biomarker < 6} 163 125 0.303895 1.000000
!{biomarker < 7} 147 110 0.237411 1.000000
!{biomarker < 8} 144 107 0.219942 1.000000
!{biomarker < 9} 141 104 0.210599 1.000000

Estimates within training (un-adjusted)

Code
SG_est$tab_estimates
Non-AP (training) estimates
Subgroup n n1 events m1 m0 RMST HR (95% CI) AHR(po)
ITT 382 (100%) 189 (49%) 343 (90%) 11.8 10.7 4.72 (1.58,7.86) 0.68 (0.55,0.84) 0.74
Questionable 95 (25%) 46 (48%) 94 (99%) 6.2 8.2 -3.77 (-7.14,-0.39) 1.66 (1.09,2.52) 2.52
Recommend 287 (75%) 143 (50%) 249 (87%) 16 11.8 7.28 (3.68,10.87) 0.55 (0.42,0.71) 0.49

The selected subgroup is {biomarker < 2}

Next plot the K-M curves within the estimated non-AP subgroups.

Code
plot_found.subgroups(fs=fs, df=fs$df.est, title_itt=c("Non-AP ITT"))

Next, display the estimates within the AP population (testing data)

Code
# Training summaries
SG_test <- sg_tables(fs=fs,which_df="testing",est_caption="AP (testing) estimates",potentialOutcome.name="loghr.po")
Code
SG_test$tab_estimates
AP (testing) estimates
Subgroup n n1 events m1 m0 RMST HR (95% CI) AHR(po)
ITT 125 (100%) 63 (50%) 59 (47%) 49.3 NA 0.37 (-5.42,6.16) 1.08 (0.65,1.81) 0.73
Questionable 25 (20%) 15 (60%) 15 (60%) 26.3 NA -7.19 (-17.55,3.16) 3.43 (0.96,12.28) 2.52
Recommend 100 (80%) 48 (48%) 44 (44%) NA NA 3.02 (-2.54,8.59) 0.77 (0.42,1.39) 0.54

K-M curves applied to AP subgroups.

Code
plot_found.subgroups(fs=fs, df=fs$df.test, title_itt=c("AP ITT"))

What happens under the (“null”) where treatment is uniform?

Code
# Null uniform benefit hr=0.7
log.hrs <- log(c(0.7,0.7,0.7))

dgm <- get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)

Code
df_null <- draw_sim_stratified(dgm=dgm,ss=1,wname="AP_region",bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)

fs <- forestsearch(df.analysis=subset(df_null,AP_region==0),  
confounders.name=confounders.name,
outcome.name="y.sim", treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = 0.90, hr.consistency = 0.80, pconsistency.threshold = 0.90,
sg_focus="minSG", details=TRUE,
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="), maxk=1)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q16 q14 q20 q21 q19 q17 q18 q13 q3 q1 q5 q15 q7 q12 q4 q6 q8 q11 q9 q10 q2 
          Factors Labels VI(grf)
16           male    q16  0.1656
14      age <= 52    q14  0.1012
20          ecog1    q20  0.0963
21 mAD_CTregimen1    q21  0.0769
19        surgery    q19  0.0600
17      EU_region    q17  0.0594
18      histology    q18  0.0594
13      age <= 61    q13  0.0546
3   biomarker < 2     q3  0.0512
1       age <= 65     q1  0.0474
5   biomarker < 4     q5  0.0349
15      age <= 68    q15  0.0344
7   biomarker < 6     q7  0.0312
12    age <= 59.6    q12  0.0293
4   biomarker < 3     q4  0.0269
6   biomarker < 5     q6  0.0261
8   biomarker < 7     q8  0.0162
11 biomarker < 10    q11  0.0101
9   biomarker < 8     q9  0.0097
10  biomarker < 9    q10  0.0091
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 1.666667e-05 
Approximately 10% of max_count met: minutes 5e-05 
Approximately 20% of max_count met: minutes 1e-04 
Approximately 33% of max_count met: minutes 0.0001666667 
Approximately 50% of max_count met: minutes 0.0002333333 
Approximately 75% of max_count met: minutes 0.0003666667 
Approximately 90% of max_count met: minutes 0.0004333333 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0004666667 
Number of subgroups meeting HR threshold 1 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  1 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Kmet (found), Subgroup, % Consistency= 0 {surgery} 0.547 
Subgroup Consistency Minutes= 0.01845 
NO subgroup found (FS) 
Minutes forestsearch overall= 0.02036667 
Code
if(output) save(fs, file="output/sim3_k1min_null.Rdata")
Code
if(loadit) load("output/sim3_k1min_null.Rdata")
if(is.null(fs$sg.harm)) cat("Subgroup NOT found","\n")
Subgroup NOT found 

Next, let’s look at selecting smallest subgroup among 2-factor combinations (maxk=2)

Code
# sg_focus="MinSG" selects smallest subgroup meeting selection criteria

fs <- forestsearch(df.analysis=df_nonAP, confounders.name=confounders.name,
outcome.name="y.sim",treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = 0.90, hr.consistency = 0.80, pconsistency.threshold = 0.90,
sg_focus = "minSG",
showten_subgroups=TRUE, details=TRUE,
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="),
maxk=2, plot.sg=TRUE)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q9 q8 q10 q11 q7 q21 q17 q18 q20 q13 q12 q14 q1 q3 q15 q6 q19 q4 q16 q5 q2 
          Factors Labels VI(grf)
9   biomarker < 8     q9  0.2154
8   biomarker < 7     q8  0.1573
10  biomarker < 9    q10  0.1192
11 biomarker < 10    q11  0.1111
7   biomarker < 6     q7  0.0779
21 mAD_CTregimen1    q21  0.0427
17      EU_region    q17  0.0349
18      histology    q18  0.0313
20          ecog1    q20  0.0300
13      age <= 61    q13  0.0283
12    age <= 59.6    q12  0.0242
14      age <= 52    q14  0.0203
1       age <= 65     q1  0.0180
3   biomarker < 2     q3  0.0167
15      age <= 68    q15  0.0146
6   biomarker < 5     q6  0.0130
19        surgery    q19  0.0117
4   biomarker < 3     q4  0.0112
16           male    q16  0.0111
5   biomarker < 4     q5  0.0110
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 2 903 
Approximately 5% of max_count met: minutes 0.0005833333 
Approximately 10% of max_count met: minutes 0.00075 
Approximately 20% of max_count met: minutes 0.00195 
Approximately 33% of max_count met: minutes 0.0033 
Approximately 50% of max_count met: minutes 0.005116667 
Approximately 75% of max_count met: minutes 0.007216667 
Approximately 90% of max_count met: minutes 0.0083 
# of subgroups evaluated based on (up to) maxk-factor combinations 903 
% of all-possible combinations (<= maxk) 100 
k.max= 2 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 124 128 
# of subgroups with sample size less than criteria 278 
# of subgroups meeting all criteria = 533 
# of subgroups fitted (Cox model estimable) = 533 
*Subgroup Searching Minutes=* 0.008866667 
Number of subgroups meeting HR threshold 212 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  212 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 6} {age <= 52} 0.996 
Kmet (found), Subgroup, % Consistency= 1 !{age <= 68} !{surgery} 0.581 
Kmet (found), Subgroup, % Consistency= 1 {ecog1} {age <= 52} 0.455 
Consistency met! 
# of splits, Kmet= 1000 2 
Subgroup, % Consistency Met= {age <= 59.6} {biomarker < 4} 0.978 
Consistency met! 
# of splits, Kmet= 1000 3 
Subgroup, % Consistency Met= {biomarker < 7} {age <= 52} 0.995 
Consistency met! 
# of splits, Kmet= 1000 4 
Subgroup, % Consistency Met= {age <= 59.6} {biomarker < 5} 0.983 
Kmet (found), Subgroup, % Consistency= 4 !{age <= 68} {male} 0.766 
Kmet (found), Subgroup, % Consistency= 4 {age <= 52} {male} 0.664 
Consistency met! 
# of splits, Kmet= 1000 5 
Subgroup, % Consistency Met= {EU_region} {biomarker < 5} 0.985 
Consistency met! 
# of splits, Kmet= 1000 6 
Subgroup, % Consistency Met= {biomarker < 8} {age <= 52} 0.997 
Consistency met! 
# of splits, Kmet= 1000 7 
Subgroup, % Consistency Met= {age <= 65} {biomarker < 2} 0.997 
Consistency met! 
# of splits, Kmet= 1000 8 
Subgroup, % Consistency Met= {mAD_CTregimen1} {biomarker < 3} 0.915 
Consistency met! 
# of splits, Kmet= 1000 9 
Subgroup, % Consistency Met= {age <= 61} {biomarker < 3} 0.997 
Consistency met! 
# of splits, Kmet= 1000 10 
Subgroup, % Consistency Met= {biomarker < 9} {age <= 52} 0.996 
Number of subgroups meeting consistency criteria= 
    Pcons       hr     N     E      g      m     K              M.1
    <num>    <num> <num> <num> <char> <char> <num>           <char>
 1: 0.996 1.968404    61    61    136      1     2  {biomarker < 6}
 2: 0.978 1.463912    63    63    185      4     2    {age <= 59.6}
 3: 0.995 1.972295    64    64     53      5     2  {biomarker < 7}
 4: 0.983 1.500235    64    64    184      6     2    {age <= 59.6}
 5: 0.985 1.490210    65    64    161      9     2      {EU_region}
 6: 0.997 1.969458    66    66     25     10     2  {biomarker < 8}
 7: 0.997 1.901186    66    65    193     11     2      {age <= 65}
 8: 0.915 1.227319    66    66    156     12     2 {mAD_CTregimen1}
 9: 0.997 1.855252    67    66    178     13     2      {age <= 61}
10: 0.996 1.709685    67    67     81     14     2  {biomarker < 9}
                M.2
             <char>
 1:     {age <= 52}
 2: {biomarker < 4}
 3:     {age <= 52}
 4: {biomarker < 5}
 5: {biomarker < 5}
 6:     {age <= 52}
 7: {biomarker < 2}
 8: {biomarker < 3}
 9: {biomarker < 3}
10:     {age <= 52}

[1] "{biomarker < 6}" "{age <= 52}"    
% consistency criteria met= 0.996 
SG focus= minSG 
Subgroup Consistency Minutes= 0.25085 
Subgroup found (FS) 
Minutes forestsearch overall= 0.2611 
Code
if(output) save(fs,file="output/sim3_k2tenmin.Rdata")
Code
if(loadit) load(file="output/sim3_k2tenmin.Rdata")
# Testing summaries
SG_est <- sg_tables(fs=fs,which_df="est",est_caption="Non-AP (training) estimates",potentialOutcome.name="loghr.po")

Top 10 (up to 10) subgroups meeting criteria:

Code
# Top 10 (up to 10) SG's
SG_est$sg10_out
Subgroups formed by two-factors
maxk=2
M.1 M.2 N E hr Pcons
{biomarker < 6} {age <= 52} 61 61 1.968404 0.996000
{age <= 59.6} {biomarker < 4} 63 63 1.463912 0.978000
{biomarker < 7} {age <= 52} 64 64 1.972295 0.995000
{age <= 59.6} {biomarker < 5} 64 64 1.500235 0.983000
{EU_region} {biomarker < 5} 65 64 1.490210 0.985000
{biomarker < 8} {age <= 52} 66 66 1.969458 0.997000
{age <= 65} {biomarker < 2} 66 65 1.901186 0.997000
{mAD_CTregimen1} {biomarker < 3} 66 66 1.227319 0.915000
{age <= 61} {biomarker < 3} 67 66 1.855252 0.997000
{biomarker < 9} {age <= 52} 67 67 1.709685 0.996000

Simulations

Next we setup simulations to evaluate ability to identify subgroups based on non-AP data and the accuracy to estimate treatment effects in the AP data.

The simulation function mrct_APregion_sims simulates (n_sims times) from specified data-generating-model (dgm):

  • ITT sample is drawn with subgroups identified (as above)
  • Cox estimates based on the AP local data;
  • Subgroups identified based on non-AP and applied to AP
  • For identified subgroups the Cox model estimates are created (hr_sg)
    • Note that if a subgroup is NOT found then “subgroups” are taken as the entire AP population
    • That is, if a subgroup is NOT identified then we interpret the hr_sg as the reported AP estimates which would be the overall AP (since no subgroup is found)
    • In tables below we also summarize hr_sg_null where we define hr_sg to be missing if no subgroup is found
      • This represents the AP subgroup estimates conditional on a subgroup being identified
Code
# Simulation function to automate
mrct_APregion_sims <- function(dgm,n_sims,wname="AP_region",bw=-log(5),sg_focus="minSG",maxk=1, 
hr.threshold=0.90, hr.consistency=0.80, pconsistency.threshold=0.9,details=FALSE){
t.start<-proc.time()[3]
  # message for backends
  # borrowed from simtrials (sim_fixed_n)
  if (!is(plan(), "sequential")) {
    # future backend
    message("Using ", nbrOfWorkers(), " cores with backend ", attr(plan("list")[[1]], "class")[2])
  } else if (foreach::getDoParWorkers() > 1) {
    message("Using ", foreach::getDoParWorkers(), " cores with backend ", foreach::getDoParName())
    message("Warning: ")
    message("doFuture may exhibit suboptimal performance when using a doParallel backend.")
  } else {
    message("Backend uses sequential processing.")
  }
results <- foreach(
    sim = seq_len(n_sims),
    .options.future=list(seed=TRUE),
    .combine="rbind", 
    .errorhandling="pass"
  ) %dofuture% {
# simulate data
dfs <- draw_sim_stratified(dgm=dgm,ss=sim,wname=wname,bw=bw,strata_rand="stratum")
# Subgroups identified with nonAP population
df_nonAP <- subset(dfs,AP_region==0)
# Applied to AP
df_AP <- subset(dfs,AP_region==1)
# For subgroup identified estimates     
# Return NA if not estimable
# First, AP results (does not require subgroup identification)
# Initialize to missing
n_test <- c(nrow(df_AP))
n_train <- c(nrow(df_nonAP))
n_itt <- c(nrow(dfs))

fit <- summary(coxph(Surv(y.sim,event.sim) ~ treat.sim, data=df_nonAP))$conf.int
hr_train <- c(fit[1])
rm("fit")

# Overall (ITT) standard stratified (by randomization strata)
fit <- summary(coxph(Surv(y.sim,event.sim) ~ treat.sim + strata(strata.simR), data=dfs))$conf.int
hr_itt <- c(fit[1])
rm("fit")

# Overall (ITT) standard stratified (by X)
fit <- summary(coxph(Surv(y.sim,event.sim) ~ treat.sim + strata(AP_region), data=dfs))$conf.int
hr_ittX <- c(fit[1])
rm("fit")


hr_test<- NA
any_found <- 0.0
sg_found <- "none"
n_sg <- n_test
hr_sg <- NA
prev_sg <- 1.0

# Restrict to AP estimable
fitAP <- try(summary(coxph(Surv(y.sim,event.sim) ~ treat.sim, data=df_AP))$conf.int,TRUE)
if(!inherits(fitAP,"try-error")){
hr_test <- c(fitAP[1])
rm("fitAP")
# For identified subgroups?
# initialize to testing
any_found <- 0.0 
sg_found <- "none" 
n_sg <- n_test 
prev_sg <- 1.0 # Prevalence of AP: n_sg/n_all
hr_sg <- NA
# Potential outcome
POhr_sg <- exp(mean(df_AP$loghr.po))
# Note: if not found then n_sg=n_all, hr_sg=hr_all 
fs_temp <- try(forestsearch(df.analysis=df_nonAP, df.test=df_AP, 
confounders.name=confounders.name,
outcome.name="y.sim", treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = hr.threshold, hr.consistency = hr.consistency, 
pconsistency.threshold = pconsistency.threshold,
sg_focus=sg_focus, 
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts =c("biomarker <="), maxk=maxk),TRUE)
# FS with error output NA's (set above)
# FS without error
if(!inherits(fs_temp,"try-error")){
# FS done (w/o error) but NO sg found (replace NAs above with what is calculated)
if(is.null(fs_temp$sg.harm)){
# consider "sg = AP"  
any_found <- 0.0
sg_found <- "none"
n_sg <- n_test
hr_sg <- hr_test
POhr_sg <- exp(mean(df_AP$loghr.po))
prev_sg <- 1.0
}
# If sg found
if(!is.null(fs_temp$sg.harm)){
any_found <- 1.0  
df_test <- fs_temp$df.test 
sg_found <- c(paste(fs_temp$sg.harm,collapse=" & "))
if(details & sim <= 10) cat("Simulation subgroup found:",c(sim,sg_found),"\n")  
df_sg <- subset(df_test,treat.recommend==1)
n_sg <- c(nrow(df_sg))
prev_sg <- c(n_sg/n_test)
# Subgroup analysis
fitSG <- try(summary(coxph(Surv(y.sim,event.sim)~treat.sim,data=df_sg))$conf.int,TRUE)
if(!inherits(fitSG,"try_error") & is.numeric(fitSG[1])) hr_sg <- c(fitSG[1])
if(inherits(fitSG,"try_error") | !is.numeric(fitSG[1])) hr_sg <- NA
loghr.po <- df_sg$loghr.po
POhr_sg <- exp(mean(loghr.po))
rm("fitSG")
}
 }  # FS without error done
rm("fs_temp")
} # AP estimable in first place
# Results
dfres <- data.table::data.table(n_itt,hr_itt,hr_ittX,n_train,hr_train,n_test,hr_test,any_found,sg_found,n_sg,hr_sg,POhr_sg,prev_sg)
return(dfres)
  }
  t.now<-proc.time()[3]
  t.min<-(t.now-t.start)/60
  if(details){
    cat("Minutes for simulations",c(n_sims,t.min),"\n")
    cat("Projection per 1000",c(t.min*(1000/n_sims)),"\n")
    cat("Propn subgroups found =",c(sum(!is.na(results$any_found))/n_sims),"\n")
  }
# Define hr_sg_null as missing if any_found = 0
results$hr_sg_null <- results$hr_sg
results$hr_sg_null[results$any_found==0] <- NA
return(results)  
}

summaryout_mrct <- function(pop_summary,mrct_sims, 
out_sgs = c("sg_found","sg_biomarker","sg_age"),
out_sgs2 = c("sg_biomarker","sg_age","sg_male","sg_ecog1","sg_histology","sg_CTregimen","sg_EU","sg_surgery"),
out_est = c("+ APflag + sg_le85 + APflag2 + APflag3 + hr_sg_null"),
sg_type=1,
tab_caption = c("Identified subgroups and estimation summaries under biomarker effects: 1-factor combinations"), showtable=TRUE){

outwhat1 <- c("~ hr_itt + hr_ittX + hr_test + found + prev_sg + hr_sg + POhr_sg +")
if(sg_type==1) outwhat2 <- paste(out_sgs, collapse="+")
if(sg_type==2) outwhat2 <- paste(out_sgs2, collapse="+")
out_what <- as.formula(paste(outwhat1,outwhat2,out_est))

res <- list()
# True causal AHR
res$ahr_true = c(round(pop_summary$AHR,4))
# AHR Cox un-adjusted (only treatment)
res$ahr_unadj = c(round(pop_summary$ITT_unadj,4))
# Treatment plus randomization stratificaton sR
res$ahr_sR = c(round(pop_summary$ITT_sR,4))
# adjustment by w
res$ahr_sRw = c(round(pop_summary$ITT_sRw,4))
# Relative biases 
res$bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)
res$bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)
res$bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)
# The bias within W=1 population
res$ahr_w1 = c(round(pop_summary$W_1,4))
res$bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)

mrct_sims$APflag <- ifelse(mrct_sims$hr_test > 0.9,"AP > 0.9","AP <=0.9")
mrct_sims$sg_le85 <- ifelse(mrct_sims$hr_sg <=0.85,"AP(sg)<=0.85","AP(sg)>0.85")

mrct_sims$APflag2 <- ifelse(mrct_sims$hr_test > 0.9 & mrct_sims$hr_sg <= 0.85,"AP > 0.9 & AP(sg) <= 0.85","Not")
mrct_sims$APflag3 <- ifelse(mrct_sims$hr_test > 0.9 & mrct_sims$hr_sg <= 0.80,"AP > 0.9 & AP(sg) <= 0.80","Not")

mrct_sims$found <- as.factor(mrct_sims$any_found)

# Classify by specific factors identified
mrct_sims$sg_biomarker <- ifelse(grepl("biomarker",mrct_sims$sg_found),
"biomarker",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_age <- ifelse(grepl("age",mrct_sims$sg_found),
"age",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_ecog1 <- ifelse(grepl("ecog1",mrct_sims$sg_found),
"ecog1",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_histology <- ifelse(grepl("histology",mrct_sims$sg_found),
"histology",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_CTregimen <- ifelse(grepl("mAD_CTregimen1",mrct_sims$sg_found),
"mAD_CT",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_male <- ifelse(grepl("male",mrct_sims$sg_found),
"male",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_surgery <- ifelse(grepl("surgery",mrct_sims$sg_found),
"surgery",ifelse(mrct_sims$sg_found!="none","other","none"))

mrct_sims$sg_EU <- ifelse(grepl("EU_region",mrct_sims$sg_found),
"EU region",ifelse(mrct_sims$sg_found!="none","other","none"))

out_table <- table1(out_what, data=mrct_sims, caption=tab_caption)

if(showtable) print(out_table)

return(list(res=res,out_table=out_table))
}

SGplot_estimates <- function(df,label_training="Training",label_testing="Testing", label_itt="ITT (sR)", label_sg="Testing(subgroup)"){
df_itt <- data.table(est=df$hr_itt, analysis=label_itt)
df_training <- data.table(est=df$hr_train, analysis=label_training)
df_testing <- data.table(est=df$hr_test, analysis=label_testing)
df_sg <- data.table(est=df$hr_sg, analysis=label_sg)

hr_estimates <- rbind(df_itt, df_training, df_testing, df_sg)  
est_order <- c(label_itt,label_training,label_testing,label_sg)
hr_estimates <- within(hr_estimates, {analysis <- factor(analysis,levels=est_order)})
p <- ggplot(hr_estimates, aes(x=analysis, y=est, fill=analysis)) + 
  geom_violin(trim=FALSE)
plot_estimates <- p + geom_boxplot(width=0.1)
return(list(dfPlot_estimates=hr_estimates, plot_estimates=plot_estimates))
} 

Strong biomarker effect (prognostic effect bx=log5)

Simulations under strong biomarker effect

Code
# Run sims and store results
t.start<-proc.time()[3]
# DGM 
#log.hrs <- log(c(2.5,1.25,0.50))
# example 2
log.hrs <- log(c(3,1.25,0.50))

dgm <- get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)

Code
pop_summary <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",
                                   bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)

# True causal AHR
ahr_true = c(round(pop_summary$AHR,4))
# AHR Cox un-adjusted (only treatment)
ahr_unadj = c(round(pop_summary$ITT_unadj,4))
# Treatment plus randomization stratificaton sR
ahr_sR = c(round(pop_summary$ITT_sR,4))
# sR including adjustment by x
ahr_sRw = c(round(pop_summary$ITT_sRw,4))
# Relative biases 
bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)
bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)
bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)
# The bias within W=1 population
ahr_w1 = c(round(pop_summary$W_1,4))
bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)

cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")
AHR true (causal) and within AP: 0.7368 1.016 
Code
n_sims <- nsims

library(doFuture)
library(doRNG)
registerDoFuture()
registerDoRNG()

mrct_sims <- mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",
                                bw=-log(5),sg_focus="minSG",details=TRUE)
Simulation subgroup found: 1 {biomarker < 2} 
Simulation subgroup found: 2 {biomarker < 2} 
Simulation subgroup found: 3 {biomarker < 2} 
Simulation subgroup found: 4 {biomarker < 2} 
Simulation subgroup found: 5 {biomarker < 2} 
Simulation subgroup found: 6 {biomarker < 2} 
Simulation subgroup found: 7 {biomarker < 2} 
Simulation subgroup found: 8 {biomarker < 2} 
Simulation subgroup found: 9 {biomarker < 2} 
Simulation subgroup found: 10 {biomarker < 2} 
Minutes for simulations 1000 2.969317 
Projection per 1000 2.969317 
Propn subgroups found = 1 
Code
t.now<-proc.time()[3]
t.min<-(t.now-t.start)/60
cat("Minutes (doFuture) simulations",c(t.min),"\n")
Minutes (doFuture) simulations 2.984367 
Code
cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")
How often was a subgroup identified= 1 
Code
if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims3_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims3_v0.Rdata")

# Look at large-sample spline in non-AP
dflarge <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,wname="AP_region",bw=-log(5),
                               strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)

temp <- cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",
strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",
main_title=c("Non-AP large sample ~ training estimation properties"))

Code
true_pos <- round(100*c(mean(mrct_sims$any_found)),4)

temp <- summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE)

ahr_true <- c(temp$res$ahr_true)
ahr_unadj <- c(temp$res$ahr_unadj)
ahr_sR <- c(temp$res$ahr_sR)
ahr_sRw <- c(temp$res$ahr_sRw)
ahr_w1 <- c(temp$res$ahr_w1)
  • The underlying true ITT causal hazard ratio is 0.7368
    • The large-sample (single draw) approximation to ITT un-adjusted is 0.7904
    • The large-sample ~ to ITT stratified is 0.7641
    • The large-sample ~ to ITT stratified + x-adjustment is 0.7458
    • The large-sample ~ to AP region analysis is 1.016

The true-positive rate (identifying a subgroup under HTE (heterogeneous treatment effects))

  • Across 1,000 simulations (a subgroup was identified) = 100%

Summary of subgroups identified and estimation properties are:

Code
# Show the table
temp$out_table
Identified subgroups and estimation summaries under biomarker effects: 1-factor combinations
Overall
(N=1000)
hr_itt
Mean (SD) 0.778 (0.0683)
Median [Min, Max] 0.775 [0.564, 1.03]
hr_ittX
Mean (SD) 0.764 (0.0662)
Median [Min, Max] 0.763 [0.555, 1.01]
hr_test
Mean (SD) 1.05 (0.266)
Median [Min, Max] 1.03 [0.431, 2.50]
found
1 1000 (100%)
prev_sg
Mean (SD) 0.800 (0.00613)
Median [Min, Max] 0.800 [0.688, 0.800]
hr_sg
Mean (SD) 0.825 (0.244)
Median [Min, Max] 0.802 [0.288, 2.13]
POhr_sg
Mean (SD) 0.536 (0.00878)
Median [Min, Max] 0.536 [0.536, 0.696]
sg_found
!{age <= 68} 3 (0.3%)
{biomarker < 2} 997 (99.7%)
sg_biomarker
biomarker 997 (99.7%)
other 3 (0.3%)
sg_age
age 3 (0.3%)
other 997 (99.7%)
APflag
AP <=0.9 313 (31.3%)
AP > 0.9 687 (68.7%)
sg_le85
AP(sg)<=0.85 607 (60.7%)
AP(sg)>0.85 393 (39.3%)
APflag2
AP > 0.9 & AP(sg) <= 0.85 297 (29.7%)
Not 703 (70.3%)
APflag3
AP > 0.9 & AP(sg) <= 0.80 192 (19.2%)
Not 808 (80.8%)
hr_sg_null
Mean (SD) 0.825 (0.244)
Median [Min, Max] 0.802 [0.288, 2.13]
Code
temp <- SGplot_estimates(df=mrct_sims, label_training="Non-AP, ITT", label_itt="Overall, ITT", label_testing="AP, ITT", label_sg="AP, identified subgroup")
plot(temp$plot_estimates)

Null model where benefit is uniform (prognostic effect bx=log5)

Simulations under the null where treatment benefit is uniform; Treatment does not depend on any baseline factors (hazard-ratio is 0.70)

Code
# Increase simulations to 5,000 under "null"
t.start<-proc.time()[3]

# Null uniform benefit hr=0.7
log.hrs <- log(c(0.7,0.7,0.7))
dgm <- get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)

Code
pop_summary <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",
                                   bw=log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)

# True causal AHR
ahr_true = c(round(pop_summary$AHR,4))
# AHR Cox un-adjusted (only treatment)
ahr_unadj = c(round(pop_summary$ITT_unadj,4))
# Treatment plus randomization stratificaton sR
ahr_sR = c(round(pop_summary$ITT_sR,4))
# sR including adjustment by x
ahr_sRw = c(round(pop_summary$ITT_sRw,4))
# Relative biases 
bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)
bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)
bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)
# The bias within X=1 population
ahr_w1 = c(round(pop_summary$W_1,4))
bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)

cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")
AHR true (causal) and within AP: 0.7 0.7073 
Code
n_sims <- nsims

library(doFuture)
library(doRNG)
registerDoFuture()
registerDoRNG()


mrct_sims <- mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",
                                bw=-log(5),sg_focus="minSG",details=TRUE)
Simulation subgroup found: 6 !{ecog1} 
Minutes for simulations 1000 4.606733 
Projection per 1000 4.606733 
Propn subgroups found = 1 
Code
t.now<-proc.time()[3]
t.min<-(t.now-t.start)/60
cat("Minutes (doFuture) simulations",c(t.min),"\n")
Minutes (doFuture) simulations 4.62225 
Code
cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")
How often was a subgroup identified= 0.099 
Code
if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims03_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims03_v0.Rdata")

false_pos <- round(100*c(mean(mrct_sims$any_found)),4)

# Look at large-sample spline in non-AP
dflarge <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,wname="AP_region",bw=log(5),
                               strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)

temp <- cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",
strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",
main_title=c("Non-AP large sample ~ training estimation properties"))

Code
# Note, here too many individual subgroup cuts, set sg_type=2
temp <- summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE, 
                        sg_type=2, tab_caption=c("Identified subgroups and estimation summaries under uniform treatment effects: 1-factor combinations"))

ahr_true <- c(temp$res$ahr_true)
ahr_unadj <- c(temp$res$ahr_unadj)
ahr_sR <- c(temp$res$ahr_sR)
ahr_sRw <- c(temp$res$ahr_sRw)
ahr_w1 <- c(temp$res$ahr_w1)
  • The underlying true ITT causal hazard ratio is 0.7
    • The large-sample (single draw) approximation to ITT un-adjusted is 0.7422
    • The large-sample ~ to ITT stratified is 0.7049
    • The large-sample ~ to ITT stratified + x-adjustment is 0.7038
    • The large-sample ~ to AP region analysis is 0.7073

The false-positive rate (identifying a subgroup under uniform benefit)

  • Across 1,000 simulations (a subgroup was falsely identified) = 9.9%

Summary of subgroups identified and estimation properties are:

Code
# Show the table
temp$out_table
Identified subgroups and estimation summaries under uniform treatment effects: 1-factor combinations
Overall
(N=1000)
hr_itt
Mean (SD) 0.708 (0.0689)
Median [Min, Max] 0.704 [0.491, 0.996]
hr_ittX
Mean (SD) 0.708 (0.0675)
Median [Min, Max] 0.705 [0.493, 1.01]
hr_test
Mean (SD) 0.733 (0.208)
Median [Min, Max] 0.712 [0.271, 1.80]
found
0 901 (90.1%)
1 99 (9.9%)
prev_sg
Mean (SD) 0.965 (0.122)
Median [Min, Max] 1.00 [0, 1.00]
hr_sg
Mean (SD) 0.739 (0.233)
Median [Min, Max] 0.715 [0.261, 3.67]
Missing 2 (0.2%)
POhr_sg
Mean (SD) 0.700 (0)
Median [Min, Max] 0.700 [0.700, 0.700]
Missing 1 (0.1%)
sg_biomarker
biomarker 20 (2.0%)
none 901 (90.1%)
other 79 (7.9%)
sg_age
age 41 (4.1%)
none 901 (90.1%)
other 58 (5.8%)
sg_male
male 10 (1.0%)
none 901 (90.1%)
other 89 (8.9%)
sg_ecog1
ecog1 7 (0.7%)
none 901 (90.1%)
other 92 (9.2%)
sg_histology
histology 3 (0.3%)
none 901 (90.1%)
other 96 (9.6%)
sg_CTregimen
mAD_CT 2 (0.2%)
none 901 (90.1%)
other 97 (9.7%)
sg_EU
EU region 4 (0.4%)
none 901 (90.1%)
other 95 (9.5%)
sg_surgery
none 901 (90.1%)
other 87 (8.7%)
surgery 12 (1.2%)
APflag
AP <=0.9 796 (79.6%)
AP > 0.9 204 (20.4%)
sg_le85
AP(sg)<=0.85 732 (73.2%)
AP(sg)>0.85 266 (26.6%)
Missing 2 (0.2%)
APflag2
AP > 0.9 & AP(sg) <= 0.85 3 (0.3%)
Not 997 (99.7%)
APflag3
AP > 0.9 & AP(sg) <= 0.80 2 (0.2%)
Not 998 (99.8%)
hr_sg_null
Mean (SD) 0.798 (0.392)
Median [Min, Max] 0.745 [0.261, 3.67]
Missing 903 (90.3%)
Code
temp <- SGplot_estimates(df=mrct_sims, label_training="Non-AP, ITT", label_itt="Overall, ITT", label_testing="AP, ITT", label_sg="AP, identified subgroup")
plot(temp$plot_estimates)

Null model where benefit is uniform (prognostic effect bx=log5): maxk=2

Simulations under the null where treatment benefit is uniform (maxk=2); Treatment does not depend on any baseline factors (hazard-ratio is 0.7)

Note: Allowing for 2-factor combinations may require more stringent hazard-ratio thresholds; Here we increase to 1.25 (1.0) as in paper

Code
t.start<-proc.time()[3]

# Null uniform benefit hr=0.7
log.hrs <- log(c(0.7,0.7,0.7))
dgm <- get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)

Code
pop_summary <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",
                                   bw=-log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)

# True causal AHR
ahr_true = c(round(pop_summary$AHR,4))
# AHR Cox un-adjusted (only treatment)
ahr_unadj = c(round(pop_summary$ITT_unadj,4))
# Treatment plus randomization stratificaton sR
ahr_sR = c(round(pop_summary$ITT_sR,4))
# sR including adjustment by x
ahr_sRw = c(round(pop_summary$ITT_sRw,4))
# Relative biases 
bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)
bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)
bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)
# The bias within X=1 population
ahr_w1 = c(round(pop_summary$W_1,4))
bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)

cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")
AHR true (causal) and within AP: 0.7 0.7173 
Code
n_sims <- nsims

library(doFuture)
library(doRNG)
registerDoFuture()
registerDoRNG()

mrct_sims <- mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",bw=-log(5),
sg_focus="hr", maxk=2, hr.threshold=1.25, hr.consistency=1.0, details=TRUE)
Simulation subgroup found: 10 {EU_region} & {age <= 59.6} 
Minutes for simulations 1000 3.874667 
Projection per 1000 3.874667 
Propn subgroups found = 1 
Code
t.now<-proc.time()[3]
t.min<-(t.now-t.start)/60
cat("Minutes (doFuture) simulations",c(t.min),"\n")
Minutes (doFuture) simulations 3.886633 
Code
cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")
How often was a subgroup identified= 0.076 
Code
if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims03_k2_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims03_k2_v0.Rdata")
false_pos <- round(100*c(mean(mrct_sims$any_found)),4)

cat("What do a few non-null look like","\n")
What do a few non-null look like 
Code
head(mrct_sims$sg_found[which(mrct_sims$any_found==1)])
[1] "{EU_region} & {age <= 59.6}"         "!{biomarker < 7} & !{histology}"    
[3] "!{mAD_CTregimen1} & {biomarker < 6}" "!{ecog1} & !{biomarker < 4}"        
[5] "!{age <= 65} & {biomarker < 8}"      "{surgery} & !{biomarker < 5}"       
Code
# Look at large-sample spline in non-AP
# This is same as above so do not repeat
# dflarge <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=10000,xname="AP_region",bx=log(5),strata_rand="stratum",checking=FALSE,details=FALSE,return_df=TRUE)
# temp <- cox_cs_fit(df=subset(dflarge,AP_region==0),tte.name="y.sim",event.name="event.sim",treat.name="treat.sim",
# strata.name="strata.simO",z.name=c("z"), details=FALSE ,boots=0, xlab=c("Biomarker"),maxz = 25, ydel=0.5, cex_legend=0.5, show_plot=TRUE,truebeta.name="loghr.po",
# main_title=c("Non-AP large sample ~ training estimation properties"))

# Note, here too many individual subgroup cuts, set sg_type=2
temp <- summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE, 
                        sg_type=2, tab_caption=c("Identified subgroups and estimation summaries under uniform treatment effects: 2-factor combinations"))

ahr_true <- c(temp$res$ahr_true)
ahr_unadj <- c(temp$res$ahr_unadj)
ahr_sR <- c(temp$res$ahr_sR)
ahr_sRw <- c(temp$res$ahr_sRw)
ahr_w1 <- c(temp$res$ahr_w1)
  • The underlying true ITT causal hazard ratio is 0.7
    • The large-sample (single draw) approximation to ITT un-adjusted is 0.7619
    • The large-sample ~ to ITT stratified is 0.706
    • The large-sample ~ to ITT stratified + x-adjustment is 0.7056
    • The large-sample ~ to AP region analysis is 0.7173

The false-positive rate (identifying a subgroup under uniform benefit)

  • Across 1,000 simulations (a subgroup was falsely identified) = 7.6%

Summary of subgroups identified and estimation properties are:

Code
# Show the table
temp$out_table
Identified subgroups and estimation summaries under uniform treatment effects: 2-factor combinations
Overall
(N=1000)
hr_itt
Mean (SD) 0.708 (0.0689)
Median [Min, Max] 0.704 [0.491, 0.996]
hr_ittX
Mean (SD) 0.708 (0.0675)
Median [Min, Max] 0.705 [0.493, 1.01]
hr_test
Mean (SD) 0.733 (0.208)
Median [Min, Max] 0.712 [0.271, 1.80]
found
0 924 (92.4%)
1 76 (7.6%)
prev_sg
Mean (SD) 0.986 (0.0641)
Median [Min, Max] 1.00 [0.360, 1.00]
hr_sg
Mean (SD) 0.732 (0.209)
Median [Min, Max] 0.711 [0.271, 1.80]
POhr_sg
Mean (SD) 0.700 (0)
Median [Min, Max] 0.700 [0.700, 0.700]
sg_biomarker
biomarker 41 (4.1%)
none 924 (92.4%)
other 35 (3.5%)
sg_age
age 48 (4.8%)
none 924 (92.4%)
other 28 (2.8%)
sg_male
male 10 (1.0%)
none 924 (92.4%)
other 66 (6.6%)
sg_ecog1
ecog1 9 (0.9%)
none 924 (92.4%)
other 67 (6.7%)
sg_histology
histology 7 (0.7%)
none 924 (92.4%)
other 69 (6.9%)
sg_CTregimen
mAD_CT 11 (1.1%)
none 924 (92.4%)
other 65 (6.5%)
sg_EU
EU region 13 (1.3%)
none 924 (92.4%)
other 63 (6.3%)
sg_surgery
none 924 (92.4%)
other 67 (6.7%)
surgery 9 (0.9%)
APflag
AP <=0.9 796 (79.6%)
AP > 0.9 204 (20.4%)
sg_le85
AP(sg)<=0.85 743 (74.3%)
AP(sg)>0.85 257 (25.7%)
APflag2
AP > 0.9 & AP(sg) <= 0.85 3 (0.3%)
Not 997 (99.7%)
APflag3
AP > 0.9 & AP(sg) <= 0.80 2 (0.2%)
Not 998 (99.8%)
hr_sg_null
Mean (SD) 0.736 (0.225)
Median [Min, Max] 0.717 [0.303, 1.19]
Missing 924 (92.4%)
Code
temp <- SGplot_estimates(df=mrct_sims, label_training="Non-AP, ITT", label_itt="Overall, ITT", label_testing="AP, ITT", label_sg="AP, identified subgroup")
plot(temp$plot_estimates)

Strong biomarker effect (prognostic effect bx=log5) maxk=2

Simulations under strong biomarker effect and stronger prognostic effect allowing for 2-factor combination (maxk=2)

Code
t.start<-proc.time()[3]

# DGM 
#log.hrs <- log(c(2,1.0,0.60))
log.hrs <- log(c(3,1.25,0.50))

dgm <- get_dgm_stratified(df=df.case,zeta=10,knot=5,log.hrs=log.hrs,strata_tte=NULL,details=TRUE)

Code
pop_summary <- draw_sim_stratified(dgm=dgm,ss=1,Ndraw=100000,wname="AP_region",bw=-log(5),
                                   strata_rand="stratum",checking=FALSE,details=FALSE,return_df=FALSE)

# True causal AHR
ahr_true = c(round(pop_summary$AHR,4))
# AHR Cox un-adjusted (only treatment)
ahr_unadj = c(round(pop_summary$ITT_unadj,4))
# Treatment plus randomization stratificaton sR
ahr_sR = c(round(pop_summary$ITT_sR,4))
# sR including adjustment by x
ahr_sRw = c(round(pop_summary$ITT_sRw,4))
# Relative biases 
bias_unadj <-round(100*c(pop_summary$ITT_unadj-pop_summary$AHR)/pop_summary$AHR,1)
bias_sR <-round(100*c(pop_summary$ITT_sR-pop_summary$AHR)/pop_summary$AHR,1)
bias_sRw <-round(100*c(pop_summary$ITT_sRw-pop_summary$AHR)/pop_summary$AHR,1)
# The bias within X=1 population
ahr_w1 = c(round(pop_summary$W_1,4))
bias_w1 <-round(100*c(pop_summary$W_1-pop_summary$AHR)/pop_summary$AHR,1)

cat("AHR true (causal) and within AP:",c(ahr_true,ahr_w1),"\n")
AHR true (causal) and within AP: 0.7368 1.016 
Code
n_sims <- nsims

library(doFuture)
library(doRNG)
registerDoFuture()
registerDoRNG()

mrct_sims <- mrct_APregion_sims(dgm=dgm,n_sims=n_sims,wname="AP_region",bw=-log(5),
sg_focus="hr",maxk=2, hr.threshold=1.25, hr.consistency=1.0, details=TRUE)
Simulation subgroup found: 1 {biomarker < 2} & {age <= 68} 
Simulation subgroup found: 2 {biomarker < 2} & !{age <= 52} 
Simulation subgroup found: 3 {age <= 61} & {biomarker < 4} 
Simulation subgroup found: 4 {age <= 59.6} & {biomarker < 4} 
Simulation subgroup found: 5 {biomarker < 5} & !{age <= 61} 
Simulation subgroup found: 6 {biomarker < 6} & !{ecog1} 
Simulation subgroup found: 7 {EU_region} & {biomarker < 5} 
Simulation subgroup found: 8 !{mAD_CTregimen1} & {biomarker < 3} 
Simulation subgroup found: 9 {biomarker < 2} & {age <= 65} 
Simulation subgroup found: 10 {mAD_CTregimen1} & {biomarker < 5} 
Minutes for simulations 1000 3.64125 
Projection per 1000 3.64125 
Propn subgroups found = 1 
Code
t.now<-proc.time()[3]
t.min<-(t.now-t.start)/60
cat("Minutes (doFuture) simulations",c(t.min),"\n")
Minutes (doFuture) simulations 3.653617 
Code
cat("How often was a subgroup identified=",c(mean(mrct_sims$any_found)),"\n")
How often was a subgroup identified= 1 
Code
if(output) save(n_sims,pop_summary,dgm,mrct_sims,file="output/mrct_sims3_k2_v0.Rdata")
Code
if(loadit) load(file="output/mrct_sims3_k2_v0.Rdata")
true_pos <- round(100*c(mean(mrct_sims$any_found)),4)

cat("What do a few non-null look like","\n")
What do a few non-null look like 
Code
head(mrct_sims$sg_found[which(mrct_sims$any_found==1)])
[1] "{biomarker < 2} & {age <= 68}"   "{biomarker < 2} & !{age <= 52}" 
[3] "{age <= 61} & {biomarker < 4}"   "{age <= 59.6} & {biomarker < 4}"
[5] "{biomarker < 5} & !{age <= 61}"  "{biomarker < 6} & !{ecog1}"     
Code
# Note, here too many individual subgroup cuts, set sg_type=2
temp <- summaryout_mrct(pop_summary=pop_summary, mrct_sims=mrct_sims,showtable=FALSE, sg_type=2, 
                        tab_caption=c("Identified subgroups and estimation summaries under biomarker effects: 2-factor combinations"))

ahr_true <- c(temp$res$ahr_true)
ahr_unadj <- c(temp$res$ahr_unadj)
ahr_sR <- c(temp$res$ahr_sR)
ahr_sRw <- c(temp$res$ahr_sRw)
ahr_w1 <- c(temp$res$ahr_w1)
  • The underlying true ITT causal hazard ratio is 0.7368
    • The large-sample (single draw) approximation to ITT un-adjusted is 0.7904
    • The large-sample ~ to ITT stratified is 0.7641
    • The large-sample ~ to ITT stratified + x-adjustment is 0.7458
    • The large-sample ~ to AP region analysis is 1.016

The true-positive rate (identifying a subgroup under strong biomarker effects)

  • Across 1,000 simulations (a subgroup was identified) = 100%

Summary of subgroups identified and estimation properties are:

Code
# Show the table
temp$out_table
Identified subgroups and estimation summaries under biomarker effects: 2-factor combinations
Overall
(N=1000)
hr_itt
Mean (SD) 0.778 (0.0683)
Median [Min, Max] 0.775 [0.564, 1.03]
hr_ittX
Mean (SD) 0.764 (0.0662)
Median [Min, Max] 0.763 [0.555, 1.01]
hr_test
Mean (SD) 1.05 (0.266)
Median [Min, Max] 1.03 [0.431, 2.50]
found
1 1000 (100%)
prev_sg
Mean (SD) 0.837 (0.103)
Median [Min, Max] 0.856 [0.384, 1.00]
hr_sg
Mean (SD) 0.875 (0.278)
Median [Min, Max] 0.846 [0.223, 2.13]
POhr_sg
Mean (SD) 0.578 (0.0982)
Median [Min, Max] 0.593 [0.185, 0.730]
sg_biomarker
biomarker 1000 (100%)
sg_age
age 421 (42.1%)
other 579 (57.9%)
sg_male
male 85 (8.5%)
other 915 (91.5%)
sg_ecog1
ecog1 46 (4.6%)
other 954 (95.4%)
sg_histology
histology 60 (6.0%)
other 940 (94.0%)
sg_CTregimen
mAD_CT 173 (17.3%)
other 827 (82.7%)
sg_EU
EU region 130 (13.0%)
other 870 (87.0%)
sg_surgery
other 919 (91.9%)
surgery 81 (8.1%)
APflag
AP <=0.9 313 (31.3%)
AP > 0.9 687 (68.7%)
sg_le85
AP(sg)<=0.85 505 (50.5%)
AP(sg)>0.85 495 (49.5%)
APflag2
AP > 0.9 & AP(sg) <= 0.85 214 (21.4%)
Not 786 (78.6%)
APflag3
AP > 0.9 & AP(sg) <= 0.80 159 (15.9%)
Not 841 (84.1%)
hr_sg_null
Mean (SD) 0.875 (0.278)
Median [Min, Max] 0.846 [0.223, 2.13]
Code
temp <- SGplot_estimates(df=mrct_sims, label_training="Non-AP, ITT", label_itt="Overall, ITT", label_testing="AP, ITT", label_sg="AP, identified subgroup")
plot(temp$plot_estimates)

Appendix

Brief Overview of Methods

  • forestSearch (León et al. (2024)):
    • Subgroups (SGs) with HR estimates indicative of “sub-optimal benefit” (e.g., \(hr \geq 1.0\) threshold), are considered candidates with a “splitting consistency criteria” for selection1.
      • In essence, if a subgroup that appears to derive the least benefit is identified, then the complement may potentially be considered to derive benefit with a “higher degree of confidence” relative to the ITT population
    • SGs are formed by single factors (eg, SG1 = males, SG2 = age<65) and two-way combinations (e.g., SG3 = males & age<65)
    • SGs with a minimum size of \(60\) and with a minimum of number of 10 events in each treatment arm will be considered
    • Reversing the role of treatment allows for identifying subgroups with “enhanced benefit” (e.g., \(hr \leq 0.65\) threshold)
    • Continuous factors are cut at either medians (q2), or quartiles (q1, q2, q3, say)
    • Cuts are also identified by generalized random forests (GRF) which is an identification approach itself –
  • Generalized random forests (Athey, Tibshirani, and Wager (2019), Cui et al. (2023)) which is based on restricted mean survival time (RMST) comparisons as implemented in the R package Tibshirani et al. (2022):
    • An RMST benefit of (at least) \(3\) months for control is required for selection of a subgroup, where among tree depths of 1 and 2 the subgroup with the largest RMST benefit (\(\geq 3\) months in favor of control) is selected
    • Similar to forestSearch, we are targeting relatively large effects
    • SGs with sample sizes of at least \(60\) participants and a maximum tree depth of 2 is required

Example of bootstrap adjusted estimates and CIs

Example of bootstrap de-biased estimator and CI estimation. Note that this is relevant for inference with regard to the training population – the same population from which subgroups were identified. However, if the AP data will also be used for identifying the subgroups then would be applicable.

Code
fs_minSG <- forestsearch(df.analysis=df_nonAP, confounders.name=confounders.name,
outcome.name="y.sim", treat.name="treat.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
hr.threshold = 0.90, hr.consistency = 0.90, pconsistency.threshold = 0.90,
showten_subgroups=FALSE, details=FALSE, sg_focus="minSG",
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="), maxk=1, plot.sg=FALSE)

library(doFuture)
library(doRNG)
registerDoFuture()
registerDoRNG()

t.start<-proc.time()[3]

NB <- nboots

# Bootstrap bias-correction 
fs_bc <- forestsearch_bootstrap_dofuture(fs.est=fs_minSG,nb_boots=NB,show_three=TRUE,details=TRUE)
Done with Ystar_mat 
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
Dropping variables (cut only has 1 level) biomarker < 1 
# of candidate subgroup factors= 20 
 [1] "age <= 65"      "biomarker < 2"  "biomarker < 3"  "biomarker < 4" 
 [5] "biomarker < 5"  "biomarker < 6"  "biomarker < 7"  "biomarker < 8" 
 [9] "biomarker < 9"  "biomarker < 10" "age <= 59.6"    "age <= 61"     
[13] "age <= 52.2"    "age <= 68"      "male"           "EU_region"     
[17] "histology"      "surgery"        "ecog1"          "mAD_CTregimen1"
Number of factors evaluated= 20 
Confounders per grf screening q8 q6 q7 q10 q9 q12 q16 q20 q17 q11 q19 q2 q13 q1 q18 q4 q3 q5 q14 q15 
          Factors Labels VI(grf)
8   biomarker < 8     q8  0.3001
6   biomarker < 6     q6  0.1526
7   biomarker < 7     q7  0.1003
10 biomarker < 10    q10  0.0698
9   biomarker < 9     q9  0.0694
12      age <= 61    q12  0.0532
16      EU_region    q16  0.0414
20 mAD_CTregimen1    q20  0.0323
17      histology    q17  0.0281
11    age <= 59.6    q11  0.0263
19          ecog1    q19  0.0259
2   biomarker < 2     q2  0.0171
13    age <= 52.2    q13  0.0141
1       age <= 65     q1  0.0124
18        surgery    q18  0.0113
4   biomarker < 4     q4  0.0110
3   biomarker < 3     q3  0.0103
5   biomarker < 5     q5  0.0093
14      age <= 68    q14  0.0081
15           male    q15  0.0071
Number of possible configurations (<= maxk): maxk, # <= maxk 1 40 
Approximately 5% of max_count met: minutes 1.666667e-05 
Approximately 10% of max_count met: minutes 6.666667e-05 
Approximately 20% of max_count met: minutes 2e-04 
Approximately 33% of max_count met: minutes 3e-04 
Approximately 50% of max_count met: minutes 0.0003833333 
Approximately 75% of max_count met: minutes 0.0005166667 
Approximately 90% of max_count met: minutes 0.0007333333 
# of subgroups evaluated based on (up to) maxk-factor combinations 40 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 0 0 
# of subgroups with sample size less than criteria 0 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0008166667 
Number of subgroups meeting HR threshold 11 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  11 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Kmet (found), Subgroup, % Consistency= 0 !{age <= 68} 0.35 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 2} 0.998 
SG focus= minSG 
Subgroup Consistency Minutes= 0.0603 
Subgroup found (FS) 
Minutes forestsearch overall= 0.06251667 
Bootstrap done, B= 1 
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.9"   
[13] "age <= 62"      "age <= 53.2"    "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q11 q10 q9 q8 q7 q20 q12 q17 q21 q3 q18 q13 q1 q14 q15 q4 q19 q6 q16 q5 q2 
          Factors Labels VI(grf)
11 biomarker < 10    q11  0.1728
10  biomarker < 9    q10  0.1711
9   biomarker < 8     q9  0.1425
8   biomarker < 7     q8  0.1345
7   biomarker < 6     q7  0.0768
20          ecog1    q20  0.0428
12    age <= 59.9    q12  0.0391
17      EU_region    q17  0.0369
21 mAD_CTregimen1    q21  0.0344
3   biomarker < 2     q3  0.0263
18      histology    q18  0.0257
13      age <= 62    q13  0.0191
1       age <= 65     q1  0.0171
14    age <= 53.2    q14  0.0131
15      age <= 68    q15  0.0119
4   biomarker < 3     q4  0.0114
19        surgery    q19  0.0076
6   biomarker < 5     q6  0.0058
16           male    q16  0.0056
5   biomarker < 4     q5  0.0053
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 5e-05 
Approximately 10% of max_count met: minutes 8.333333e-05 
Approximately 20% of max_count met: minutes 0.0001666667 
Approximately 33% of max_count met: minutes 0.00025 
Approximately 50% of max_count met: minutes 0.00035 
Approximately 75% of max_count met: minutes 0.0005166667 
Approximately 90% of max_count met: minutes 0.0006166667 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0006666667 
Number of subgroups meeting HR threshold 11 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  11 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Kmet (found), Subgroup, % Consistency= 0 !{age <= 68} 0.83 
Kmet (found), Subgroup, % Consistency= 0 {age <= 53.2} 0.151 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 2} 0.944 
SG focus= minSG 
Subgroup Consistency Minutes= 0.09795 
Subgroup found (FS) 
Minutes forestsearch overall= 0.1026333 
Bootstrap done, B= 2 
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 60.5"   
[13] "age <= 62"      "age <= 53"      "age <= 69"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q11 q10 q8 q7 q9 q17 q15 q21 q14 q20 q12 q18 q13 q3 q1 q16 q6 q4 q5 q19 q2 
          Factors Labels VI(grf)
11 biomarker < 10    q11  0.1607
10  biomarker < 9    q10  0.1580
8   biomarker < 7     q8  0.1183
7   biomarker < 6     q7  0.1063
9   biomarker < 8     q9  0.1047
17      EU_region    q17  0.0507
15      age <= 69    q15  0.0422
21 mAD_CTregimen1    q21  0.0379
14      age <= 53    q14  0.0337
20          ecog1    q20  0.0272
12    age <= 60.5    q12  0.0256
18      histology    q18  0.0237
13      age <= 62    q13  0.0230
3   biomarker < 2     q3  0.0168
1       age <= 65     q1  0.0152
16           male    q16  0.0141
6   biomarker < 5     q6  0.0139
4   biomarker < 3     q4  0.0112
5   biomarker < 4     q5  0.0091
19        surgery    q19  0.0078
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 5e-05 
Approximately 10% of max_count met: minutes 0.0002166667 
Approximately 20% of max_count met: minutes 3e-04 
Approximately 33% of max_count met: minutes 4e-04 
Approximately 50% of max_count met: minutes 0.0004833333 
Approximately 75% of max_count met: minutes 0.00065 
Approximately 90% of max_count met: minutes 0.00075 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 8e-04 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= minSG 
Kmet (found), Subgroup, % Consistency= 0 !{age <= 69} 0.834 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= {biomarker < 2} 0.926 
SG focus= minSG 
Subgroup Consistency Minutes= 0.07008333 
Subgroup found (FS) 
Minutes forestsearch overall= 0.07511667 
Bootstrap done, B= 3 
Minutes for Boots 1000 4.509833 
Projection per 1000 4.509833 
Propn bootstrap subgroups found = 0.997 
Number timmed out= 0 
Avg proportion of maxk searched 100 
H un-adjusted estimates-----:    1.66 (1.09,2.52) 
H bias-corrected estimates--:    1.59 (0.92,2.77) 
H^c un-adjusted estimates---:    0.55 (0.42,0.71) 
H^c bias-corrected estimates:    0.55 (0.38,0.8) 
Code
t.now<-proc.time()[3]
t.min<-(t.now-t.start)/60
cat("Minutes (doFuture) bootstrap",c(t.min),"\n")
Minutes (doFuture) bootstrap 4.5121 
Code
if(output) save(fs_minSG, fs_bc, NB, file="output/boot3_minSG_k1_v0.Rdata")
Code
if(loadit) load(file="output/boot3_minSG_k1_v0.Rdata")

gt(as.data.frame(fs_bc$FSsg_tab), auto_align=FALSE) 
Subgroup n n1 events m1 m0 RMST HR (95% CI) HR* AHR(po)
Questionable 95 (25%) 46 (48%) 94 (99%) 6.2 8.2 -3.77 (-7.14,-0.39) 1.66 (1.09,2.52) 1.59 (0.92,2.77) 2.52
Recommend 287 (75%) 143 (50%) 249 (87%) 16 11.8 7.28 (3.68,10.87) 0.55 (0.42,0.71) 0.55 (0.38,0.8) 0.49
Code
dfnew <- copy(df_nonAP)
dfnew$control.sim <- 1-dfnew$treat.sim

fs_maxSG <- forestsearch(df.analysis=dfnew, confounders.name=confounders.name,
outcome.name="y.sim", treat.name="control.sim", event.name="event.sim",
potentialOutcome.name="loghr.po",
est.scale="1/hr", hr.threshold = 1/0.6, hr.consistency = 1/0.7, pconsistency.threshold = 0.90,
showten_subgroups=FALSE, details=TRUE, sg_focus="maxSG",
conf_force=c("age <= 65","biomarker < 1","biomarker < 2","biomarker < 3",
"biomarker < 4","biomarker < 5","biomarker < 6","biomarker < 7","biomarker < 8",
"biomarker < 9","biomarker < 10"), exclude_cuts=c("biomarker <="), maxk=1, plot.sg=FALSE)
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.6"   
[13] "age <= 61"      "age <= 52"      "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q9 q8 q10 q11 q7 q21 q17 q18 q20 q13 q12 q14 q1 q3 q15 q6 q19 q4 q16 q5 q2 
          Factors Labels VI(grf)
9   biomarker < 8     q9  0.2150
8   biomarker < 7     q8  0.1574
10  biomarker < 9    q10  0.1192
11 biomarker < 10    q11  0.1122
7   biomarker < 6     q7  0.0765
21 mAD_CTregimen1    q21  0.0430
17      EU_region    q17  0.0347
18      histology    q18  0.0314
20          ecog1    q20  0.0299
13      age <= 61    q13  0.0284
12    age <= 59.6    q12  0.0242
14      age <= 52    q14  0.0206
1       age <= 65     q1  0.0182
3   biomarker < 2     q3  0.0167
15      age <= 68    q15  0.0146
6   biomarker < 5     q6  0.0131
19        surgery    q19  0.0116
4   biomarker < 3     q4  0.0111
16           male    q16  0.0111
5   biomarker < 4     q5  0.0110
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 5e-05 
Approximately 10% of max_count met: minutes 8.333333e-05 
Approximately 20% of max_count met: minutes 0.0001666667 
Approximately 33% of max_count met: minutes 0.0002333333 
Approximately 50% of max_count met: minutes 0.0003333333 
Approximately 75% of max_count met: minutes 0.0004833333 
Approximately 90% of max_count met: minutes 0.0005833333 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0006333333 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= maxSG 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= !{biomarker < 2} 0.951 
SG focus= maxSG 
Subgroup Consistency Minutes= 0.0224 
Subgroup found (FS) 
Minutes forestsearch overall= 0.02458333 
Code
library(doFuture)
library(doRNG)
registerDoFuture()
registerDoRNG()

t.start<-proc.time()[3]

NB <- nboots

# Bootstrap bias-correction 
fs_bc <- forestsearch_bootstrap_dofuture(fs.est=fs_maxSG,nb_boots=NB,show_three=TRUE,details=TRUE)
Done with Ystar_mat 
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
Dropping variables (cut only has 1 level) biomarker < 1 
# of candidate subgroup factors= 20 
 [1] "age <= 65"      "biomarker < 2"  "biomarker < 3"  "biomarker < 4" 
 [5] "biomarker < 5"  "biomarker < 6"  "biomarker < 7"  "biomarker < 8" 
 [9] "biomarker < 9"  "biomarker < 10" "age <= 59.6"    "age <= 61"     
[13] "age <= 52.2"    "age <= 68"      "male"           "EU_region"     
[17] "histology"      "surgery"        "ecog1"          "mAD_CTregimen1"
Number of factors evaluated= 20 
Confounders per grf screening q8 q6 q7 q10 q9 q12 q16 q20 q17 q11 q19 q2 q13 q1 q18 q4 q3 q5 q14 q15 
          Factors Labels VI(grf)
8   biomarker < 8     q8  0.2998
6   biomarker < 6     q6  0.1533
7   biomarker < 7     q7  0.0993
10 biomarker < 10    q10  0.0701
9   biomarker < 9     q9  0.0694
12      age <= 61    q12  0.0530
16      EU_region    q16  0.0412
20 mAD_CTregimen1    q20  0.0326
17      histology    q17  0.0280
11    age <= 59.6    q11  0.0264
19          ecog1    q19  0.0260
2   biomarker < 2     q2  0.0174
13    age <= 52.2    q13  0.0140
1       age <= 65     q1  0.0124
18        surgery    q18  0.0114
4   biomarker < 4     q4  0.0110
3   biomarker < 3     q3  0.0104
5   biomarker < 5     q5  0.0093
14      age <= 68    q14  0.0081
15           male    q15  0.0069
Number of possible configurations (<= maxk): maxk, # <= maxk 1 40 
Approximately 5% of max_count met: minutes 3.333333e-05 
Approximately 10% of max_count met: minutes 8.333333e-05 
Approximately 20% of max_count met: minutes 0.0001666667 
Approximately 33% of max_count met: minutes 0.0002666667 
Approximately 50% of max_count met: minutes 0.0003666667 
Approximately 75% of max_count met: minutes 0.0005666667 
Approximately 90% of max_count met: minutes 0.0006833333 
# of subgroups evaluated based on (up to) maxk-factor combinations 40 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 0 0 
# of subgroups with sample size less than criteria 0 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.00075 
Number of subgroups meeting HR threshold 14 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  14 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= maxSG 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= !{biomarker < 2} 0.983 
SG focus= maxSG 
Subgroup Consistency Minutes= 0.0348 
Subgroup found (FS) 
Minutes forestsearch overall= 0.03695 
Bootstrap done, B= 1 
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 59.9"   
[13] "age <= 62"      "age <= 53.2"    "age <= 68"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q11 q10 q9 q8 q7 q20 q12 q17 q21 q3 q18 q13 q1 q14 q15 q4 q19 q6 q16 q5 q2 
          Factors Labels VI(grf)
11 biomarker < 10    q11  0.1728
10  biomarker < 9    q10  0.1676
9   biomarker < 8     q9  0.1425
8   biomarker < 7     q8  0.1342
7   biomarker < 6     q7  0.0810
20          ecog1    q20  0.0425
12    age <= 59.9    q12  0.0392
17      EU_region    q17  0.0365
21 mAD_CTregimen1    q21  0.0345
3   biomarker < 2     q3  0.0267
18      histology    q18  0.0259
13      age <= 62    q13  0.0191
1       age <= 65     q1  0.0170
14    age <= 53.2    q14  0.0131
15      age <= 68    q15  0.0121
4   biomarker < 3     q4  0.0115
19        surgery    q19  0.0075
6   biomarker < 5     q6  0.0056
16           male    q16  0.0056
5   biomarker < 4     q5  0.0053
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 5e-05 
Approximately 10% of max_count met: minutes 8.333333e-05 
Approximately 20% of max_count met: minutes 0.0001666667 
Approximately 33% of max_count met: minutes 0.0002333333 
Approximately 50% of max_count met: minutes 0.0003333333 
Approximately 75% of max_count met: minutes 0.0004833333 
Approximately 90% of max_count met: minutes 0.0005666667 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.0006833333 
Number of subgroups meeting HR threshold 8 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  8 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= maxSG 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= !{biomarker < 3} 0.964 
SG focus= maxSG 
Subgroup Consistency Minutes= 0.03123333 
Subgroup found (FS) 
Minutes forestsearch overall= 0.03545 
Bootstrap done, B= 2 
# of continuous/categorical characteristics 2 7 
Continuous characteristics: age biomarker 
Categorical characteristics: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Default cuts included (1st 20) age <= mean(age) age <= median(age) age <= qlow(age) age <= qhigh(age) biomarker <= mean(biomarker) biomarker <= median(biomarker) biomarker <= qlow(biomarker) biomarker <= qhigh(biomarker) 
Categorical: male EU_region AP_region histology surgery ecog1 mAD_CTregimen1 
Dropping variables (only 1 level): AP_region 
# of candidate subgroup factors= 21 
 [1] "age <= 65"      "biomarker < 1"  "biomarker < 2"  "biomarker < 3" 
 [5] "biomarker < 4"  "biomarker < 5"  "biomarker < 6"  "biomarker < 7" 
 [9] "biomarker < 8"  "biomarker < 9"  "biomarker < 10" "age <= 60.5"   
[13] "age <= 62"      "age <= 53"      "age <= 69"      "male"          
[17] "EU_region"      "histology"      "surgery"        "ecog1"         
[21] "mAD_CTregimen1"
Number of factors evaluated= 21 
Confounders per grf screening q11 q10 q8 q7 q9 q17 q15 q21 q14 q20 q12 q18 q13 q3 q1 q6 q16 q4 q5 q19 q2 
          Factors Labels VI(grf)
11 biomarker < 10    q11  0.1622
10  biomarker < 9    q10  0.1580
8   biomarker < 7     q8  0.1159
7   biomarker < 6     q7  0.1076
9   biomarker < 8     q9  0.1043
17      EU_region    q17  0.0505
15      age <= 69    q15  0.0419
21 mAD_CTregimen1    q21  0.0379
14      age <= 53    q14  0.0339
20          ecog1    q20  0.0271
12    age <= 60.5    q12  0.0256
18      histology    q18  0.0238
13      age <= 62    q13  0.0230
3   biomarker < 2     q3  0.0171
1       age <= 65     q1  0.0152
6   biomarker < 5     q6  0.0142
16           male    q16  0.0142
4   biomarker < 3     q4  0.0111
5   biomarker < 4     q5  0.0092
19        surgery    q19  0.0074
2   biomarker < 1     q2  0.0000
Number of possible configurations (<= maxk): maxk, # <= maxk 1 42 
Approximately 5% of max_count met: minutes 5e-05 
Approximately 10% of max_count met: minutes 1e-04 
Approximately 20% of max_count met: minutes 0.0001666667 
Approximately 33% of max_count met: minutes 0.00025 
Approximately 50% of max_count met: minutes 0.00175 
Approximately 75% of max_count met: minutes 0.001916667 
Approximately 90% of max_count met: minutes 0.0021 
# of subgroups evaluated based on (up to) maxk-factor combinations 42 
% of all-possible combinations (<= maxk) 100 
k.max= 1 
Events criteria for control,exp= 10 10 
# of subgroups with events less than criteria: control, experimental 1 1 
# of subgroups with sample size less than criteria 1 
# of subgroups meeting all criteria = 40 
# of subgroups fitted (Cox model estimable) = 40 
*Subgroup Searching Minutes=* 0.002233333 
Number of subgroups meeting HR threshold 10 
Subgroup candidate(s) found (FS) 
# of candidate subgroups (meeting HR criteria) =  10 
SGs (1st 10) meeting screening thresholds sorted by sg_focus= maxSG 
Kmet (found), Subgroup, % Consistency= 0 {age <= 69} 0.843 
Kmet (found), Subgroup, % Consistency= 0 !{biomarker < 2} 0.851 
Consistency met! 
# of splits, Kmet= 1000 1 
Subgroup, % Consistency Met= !{biomarker < 3} 0.985 
SG focus= maxSG 
Subgroup Consistency Minutes= 0.1051167 
Subgroup found (FS) 
Minutes forestsearch overall= 0.1116833 
Bootstrap done, B= 3 
Minutes for Boots 1000 4.161817 
Projection per 1000 4.161817 
Propn bootstrap subgroups found = 1 
Number timmed out= 0 
Avg proportion of maxk searched 100 
H un-adjusted estimates-----:    0.55 (0.42,0.71) 
H bias-corrected estimates--:    0.56 (0.38,0.8) 
H^c un-adjusted estimates---:    1.66 (1.09,2.52) 
H^c bias-corrected estimates:    1.6 (0.93,2.74) 
Code
t.now<-proc.time()[3]
t.min<-(t.now-t.start)/60
cat("Minutes (doFuture) bootstrap",c(t.min),"\n")
Minutes (doFuture) bootstrap 4.162883 
Code
if(output) save(fs_maxSG, fs_bc, NB, file="output/boot3_maxSG_k1_v0.Rdata")
Code
if(loadit) load(file="output/boot3_maxSG_k1_v0.Rdata")

gt(as.data.frame(fs_bc$FSsg_tab), auto_align=FALSE) 
Subgroup n n1 events m1 m0 RMST HR (95% CI) HR* AHR(po)
Questionable 95 (25%) 46 (48%) 94 (99%) 6.2 8.2 -3.77 (-7.14,-0.39) 1.66 (1.09,2.52) 1.6 (0.93,2.74) 2.52
Recommend 287 (75%) 143 (50%) 249 (87%) 16 11.8 7.28 (3.68,10.87) 0.55 (0.42,0.71) 0.56 (0.38,0.8) 0.49
Code
tALL.now<-proc.time()[3]
t.min<-(tALL.now-tALL.start)/60
cat("Minutes for ALL Analyses",c(t.min),"\n")
Minutes for ALL Analyses 24.83348 

References

Athey, Susan, Julie Tibshirani, and Stefan Wager. 2019. “Generalized Random Forests.” The Annals of Statistics 47 (2): 1148–78.
Cui, Yifan, Michael R Kosorok, Erik Sverdrup, Stefan Wager, and Ruoqing Zhu. 2023. Estimating heterogeneous treatment effects with right-censored data via causal survival forests.” Journal of the Royal Statistical Society Series B: Statistical Methodology, February.
León, Larry F., Thomas Jemielita, Zifang Guo, Rachel Marceau West, and Keaven M. Anderson. 2024. “Exploratory Subgroup Identification in the Heterogeneous Cox Model: A Relatively Simple Procedure.” Statistics in Medicine 43 (20): 3921–42. https://doi.org/https://doi.org/10.1002/sim.10163.
Tibshirani, Julie, Susan Athey, Rina Friedberg, Vitor Hadad, David Hirshberg, Luke Miner, Erik Sverdrup, Stefan Wager, and Marvin Wright. 2022. “Package Grf.”

Footnotes

  1. “splitting consistency criteria” judges the robustness/realness↩︎